
... Can we base such functions on generative probabilistic models, and how do such functions perform on standard benchmark datasets, such as those from the Text Retrieval Conference (TREC)? Furthermore, can these functions be implemented efficiently using an inverted-file search engine? How does their efficiency compare to that of tf.idf ranking using the cosine similarity function on large datasets like the ClueWeb09 data?

I tested IRIS.AI with the following research question (that was the minimum amount of text it said it needs):

Can we model tf.idf ranking in IR by using probability theory?

Retrieval ranking functions like the many variants of tf.idf are mostly based on heuristics. Researchers have tried thousands of variants of such functions, and in the process discovered very effective ranking functions, such as the well-known BM25 ranking formula...
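For readers who have not seen BM25 written out, here is a minimal sketch of it running over a toy inverted index. It is an illustration only, not the formulation from any particular paper or engine: the function names `build_index` and `bm25_scores` are made up for this example, the tokenizer is a plain whitespace split, the idf term uses the non-negative `log(1 + ...)` variant, and `k1=1.2`, `b=0.75` are just commonly quoted defaults.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build a toy inverted index: term -> {doc_id: term frequency}.

    `docs` maps a document id to its raw text. Tokenization is a
    lowercase whitespace split (an assumption made for brevity).
    """
    index = defaultdict(dict)
    doc_len = {}
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        doc_len[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, doc_len


def bm25_scores(query, index, doc_len, k1=1.2, b=0.75):
    """Rank documents for `query` with a BM25-style score.

    Only the postings lists of the query terms are visited, which is
    what makes this kind of function cheap on an inverted file.
    """
    n_docs = len(doc_len)
    avg_len = sum(doc_len.values()) / n_docs
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in the collection
        df = len(postings)
        # Non-negative idf variant (assumed here; other variants exist).
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in postings.items():
            # Document length normalization controlled by b.
            norm = k1 * (1 - b + b * doc_len[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / (tf + norm)
    # Highest score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `bm25_scores("ranking model", *build_index(docs))` on a small dictionary of documents returns `(doc_id, score)` pairs, best match first; documents that contain none of the query terms are left out entirely.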

"Protecting Yourself from Identity Theft" by Bruce Schneier

"Ten years ago, I could have given you all sorts of advice about using encryption, not sending information over email, securing your web connections, and a host of other things -- but most of that doesn't matter anymore. Today, your sensitive data is controlled by others, and there's nothing you can personally do to affect its security."

#2146 "Waiting for the But" 

I gave a talk at #PyCon2019 about #mastodon, #activitypub, and how to use it with #python. And why you might want to! The video is up now:


--- 3< ---
The Black Keys' ninth studio album, 'Let's Rock' is a return to the straightforward rock of the singer/guitarist Dan Auerbach and drummer Patrick Carney's early days as a band. "When we're together we are The Black Keys, that's where that real magic is," says Auerbach, "and always has been since we were sixteen."
--- 3< ---

Folks, I’m seeing a lot of people recommend setting xpinstall.signatures.required to false to fix the Firefox extensions issue.


It disables signature checking on extensions which means that you open yourself up to malicious extensions if you install any new ones or if you have auto updates on.

Either follow the instructions here by @amolith

Or go the officially recommended (but less private) route:


Headline gold: "Sinister secret backdoor found in networking gear perfect for government espionage: The Chinese are – oh no, wait, it's Cisco again"


The "unofficial" Information Retrieval Mastodon Instance.

Goal: Make a viable and valuable social space for anyone working in Information Retrieval and related scientific research.

Everyone welcome but expect some level of geekiness on the instance and federated timelines.