Try "trec disk 4 5":
toolbox.google.com/datasetsear

Your search - trec disk 4 and 5 - did not match any datasets.

Try "trec disk 4":
toolbox.google.com/datasetsear

NIST TREC Document Database: Disk 4 - NIST Special Database 22
catalog.data.gov

So... so much for Google Search effectiveness.

PS: I could not find DBPedia Entity, nor any INEX datasets. But okay, maybe they do not have the right metadata posted. Still, google.com/search?&q=DBPedia+E does work fine!

Maybe time for IR ppl to look into this as well.

@arjen Definitely time for IR people to look into this. It is one of the ideas I had years ago in the 'Smart Cities' context that I never really had time to follow up on (the joys of being an academic overwhelmed with admin). But maybe in this case it's not an effectiveness issue, more a coverage one (are these datasets indexed at all?).

@ingo well, the TREC queries I mention are an example of effectiveness issues.

Indeed, relying on metadata implies lack of coverage too.

@arjen Ah, I see. Interestingly, both TREC queries you published don't give me any result. Hence my confusion.

@ingo I see. While I am sure it did on Friday, it no longer does for me either; even worse, since that data set IS in the index. (And you see it in autocompletions.)

Bottom line: time for some IR folk to jump in.

@arjen Data set search today seems to be a bit like the good old IR days before full text was available - all relying on (bibliographic) metadata and surrogates.

@arjen I think it would be interesting to look into the content of some data sets as well, though that might OTOH be misleading (for instance, the content of a news data set might give you misleading terms about events rather than about the data set itself; then again, some information needs may want exactly that). In particular, dynamic data sets like (news or data) streams are an interesting subproblem here. DB+IR to the rescue! :-)

The "unofficial" Information Retrieval Mastodon Instance.

Goal: Make idf.social a viable and valuable social space for anyone working in Information Retrieval and related scientific research.

Everyone welcome but expect some level of geekiness on the instance and federated timelines.