Also, the latest documentation for the project is always on the GitHub page, https://github.com/dakrone/clojure-opennlp, which should stay up to date even if my posts get a little stale. The updated version of this file is at https://github.com/dakrone/clojure-opennlp/blob/master/examples/contextfinder.clj
Thanks for the heads up. I’ll work on updating the post for changes in the new versions.
but get an error loading it:
contextfinder=> (def get-sentences (make-sentence-detector "models/EnglishSD.bin.gz"))
opennlp.tools.util.InvalidFormatException: Missing the manifest.properties! (NO_SOURCE_FILE:52)
Is this a sign of deeper config issues, or is a missing library showing up? I’m in a lein repl that otherwise runs the clojure-nlp features. Any pointers appreciated. I’ve a strong Lisp background, but debugging where Clojure meets Java is opaque at the moment.
http://code.google.com/p/iks-project/source/browse/sandbox/iks-autotagging/trunk/README.txt
This is based on the MoreLikeThis similarity of Lucene:
http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/search/similar/MoreLikeThis.html
I was planning to use this to identify known named entities detected by the OpenNLP name finder by using a context of a few sentences around the span that contains the detected name.
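To make the "few sentences around the span" idea concrete, here is a minimal sketch in plain Clojure. It assumes the text has already been split into sentences and that the detected name is available as a plain string. The function names `sentence-window` and `context-for-name` and the `radius` parameter are hypothetical, and matching by substring is a simplification rather than how the OpenNLP name finder reports spans.

```clojure
(require '[clojure.string :as str])

(defn sentence-window
  "Return the sentences within `radius` positions of index `idx`,
  clamped to the bounds of the collection."
  [sentences idx radius]
  (let [sents (vec sentences)
        start (max 0 (- idx radius))
        end   (min (count sents) (+ idx radius 1))]
    (subvec sents start end)))

(defn context-for-name
  "For every sentence that mentions `name` (plain substring match, an
  assumption made for illustration), return the surrounding window of
  sentences joined into a single context string."
  [sentences name radius]
  (let [sents (vec sentences)]
    (for [[i s] (map-indexed vector sents)
          :when (str/includes? s name)]
      (str/join " " (sentence-window sents i radius)))))

(def example-sentences
  ["Intro." "Bob met Alice." "They talked." "Alice left." "End."])

(context-for-name example-sentences "Alice" 1)
;; => ("Intro. Bob met Alice. They talked."
;;     "They talked. Alice left. End.")
```

Each resulting string could then serve as the query text for a similarity lookup such as MoreLikeThis; how that query is built is outside this sketch.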