Google’s search engine for scientists upgraded for better data scouring

Google’s search engine for datasets, the cunningly named Dataset Search, is now out of beta, with new tools to better filter searches and access to almost 25 million datasets.

Dataset Search launched in September 2018, with Google hoping to slowly unify the fragmented world of online, open-access data. Although many institutions like universities, governments, and labs publish data online, it’s often difficult to find using traditional search. But by adding open-source metadata tags to their webpages, these groups can have their data indexed by Dataset Search, which now covers a huge range of information — everything from skiing injuries to volcano eruptions to penguin populations.
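As a rough illustration of what such metadata can look like, the sketch below builds a dataset description using the schema.org `Dataset` vocabulary, serialized as JSON-LD and wrapped in a script tag for embedding in a webpage. The article does not name the exact standard, so treat the choice of schema.org as an assumption here; the helper names, dataset title, description, and URL are all hypothetical placeholders.

```python
import json


def dataset_jsonld(name, description, url, keywords):
    """Build a schema.org Dataset description as a JSON-LD string.

    Assumption: schema.org's Dataset type is the markup being described;
    the article itself only says "open-source metadata tags".
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
    }
    return json.dumps(record, indent=2)


def as_script_tag(jsonld):
    """Wrap a JSON-LD string in the script tag used to embed it in HTML."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


# Hypothetical example values, for illustration only.
snippet = as_script_tag(
    dataset_jsonld(
        name="Penguin population counts",
        description="Placeholder description of an illustrative dataset.",
        url="https://example.org/datasets/penguins",
        keywords=["penguins", "population"],
    )
)
print(snippet)
```

A publisher would paste the printed tag into the page that hosts the dataset, where a crawler can read it alongside the human-readable content.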

Google would not share specific usage figures for the search engine, but it said “hundreds of thousands of users” have tried Dataset Search since its launch, and that the reaction from the scientific community has been positive overall.

Natasha Noy, a research scientist at Google AI who helped create the tool, tells The Verge that “most [data] repositories have been very responsive” and that the engine’s launch has meant older scientific institutions are now taking “publishing metadata more seriously.”

“For example, [the prestigious scientific journal] Nature is changing its policies to require data sharing with proper metadata,” Noy says, highlighting a change that will make the data underpinning top-flight scientific research more accessible in future.

*“Finally! My thesis ‘Hitting The Slopes A Little Too Hard: Shattered Femurs and Broken Dreams In the 2012 World Ski Cup,’ will have the rigorous, data-based grounding it deserves.”*

New features added to Dataset Search include the ability to filter data by type (tables, images, text, etc.), by whether it’s free to use, and by the geographic areas it covers. The engine is also now available on mobile and has expanded dataset descriptions.

Google says the corpus covered by the search engine — almost 25 million datasets — is only a “fraction of datasets on the web,” but a “significant” one all the same. The largest topics indexed are geosciences, biology, and agriculture, and the most common queries include “education,” “weather,” “cancer,” “crime,” “soccer,” and “dogs.” The US is also the leader in open government datasets, publishing more than 2 million online.

Noy would not comment on future plans for Dataset Search, but she says the team is thinking about a number of functions they hope will be useful, including “understanding how datasets are cited and reused” and “helping users explore datasets in Dataset Search when they don’t necessarily know what they are looking for.”

“And, of course, continuing to expand the corpus,” says Noy. There’s always more data out there.
