Parens for Python - Sci SpaCy

NLP for scientific text

We are going to explore some more Python libraries through the use of libpython-clj.

This time, we are going to look at Sci SpaCy (scispaCy), a spaCy-based package for processing scientific and biomedical text.

;; deps.edn
{:deps
 {org.clojure/clojure {:mvn/version "1.10.1"}
  clj-python/libpython-clj {:mvn/version "1.36"}}}

Install the Python dependencies and model

pip3 install spacy scispacy
pip3 install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.2.4/en_core_sci_sm-0.2.4.tar.gz

We will follow the tutorial at https://allenai.github.io/scispacy/

Load up the model and analyze

The first thing we need to do is set up the namespace and load the model.

(ns gigasquid.sci-spacy
  (:require [libpython-clj.require :refer [require-python]]
            [libpython-clj.python :as py :refer [py. py.. py.-]]))
(require-python '[spacy :as spacy])
(require-python '[scispacy :as scispacy])
(def nlp (spacy/load "en_core_sci_sm"))
gigasquid.sci-spacy/nlp

Now, we are ready to analyze some text:

(def text "Myeloid derived suppressor cells (MDSC) are immature 
  myeloid cells with immunosuppressive activity. 
  They accumulate in tumor-bearing mice and humans 
  with different types of cancer, including hepatocellular 
  carcinoma (HCC).")
(def doc (nlp text))
gigasquid.sci-spacy/doc

Let's find all the entities.

(map (fn [ent] (py.- ent text)) (py.- doc ents))
List(12) ("Myeloid", "suppressor cells", "MDSC", "immature", "myeloid cells", "immunosuppressive activity", "accumulate", "tumor-bearing mice", "humans", "cancer", "hepatocellular carcinoma", "HCC")
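
Each entity is a spaCy Span, so it carries more than its text. As a rough sketch (the function name entity-details is made up for illustration, and it assumes the doc var defined above), we can also pull out the label and character offsets using the standard spaCy Span attributes label_, start_char and end_char:

```clojure
(require '[libpython-clj.python :refer [py.-]])

;; Hypothetical helper, not part of libpython-clj or spaCy.
(defn entity-details
  "Returns a vector of maps describing each entity in a spaCy Doc."
  [doc]
  (vec (map (fn [ent]
              {:text  (py.- ent text)        ; the entity string
               :label (py.- ent label_)      ; entity type assigned by the model
               :start (py.- ent start_char)  ; character offset where the entity begins
               :end   (py.- ent end_char)})  ; character offset just past the entity
            (py.- doc ents))))

;; Assumes `doc` is the analyzed text from the previous step.
(entity-details doc)
```

The offsets index into the original text string, which is handy if you want to highlight entities in place.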

We can do the same with the sentences.

(map (fn [sent] (py.- sent text)) (py.- doc sents))
List(2) ("Myeloid derived suppressor cells (MDSC) are immature myeloid cells with immunosuppressive activity. ", "They accumulate in tumor-bearing mice and humans with different types of cancer, including hepatocellular carcinoma (HCC).")
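
Sentences and entities can also be combined. The following is only a sketch under a couple of assumptions: entities-by-sentence is a name invented here, it reuses the doc var from above, and it relies on each sentence Span exposing its own ents attribute, as spaCy Spans do:

```clojure
(require '[libpython-clj.python :refer [py.-]])

;; Hypothetical helper, not part of libpython-clj or spaCy.
(defn entities-by-sentence
  "Returns one map per sentence with the sentence text and the entities found in it."
  [doc]
  (vec (map (fn [sent]
              {:sentence (py.- sent text)
               ;; Span.ents holds the entities that fall inside this sentence
               :entities (vec (map (fn [ent] (py.- ent text))
                                   (py.- sent ents)))})
            (py.- doc sents))))

;; Assumes `doc` is the analyzed text from the previous step.
(entities-by-sentence doc)
```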

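We can even graph things! Here we render the dependency parse of the first sentence as an SVG file. Note that spit does not create missing directories, so the results directory must already exist.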

(require-python '[spacy.displacy :as displacy])
(spit "results/my-pic.svg" (displacy/render (first (py.- doc sents)) :style "dep"))
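
The same dependency information that displacy draws is available as data on each token. Here is a minimal sketch, assuming the doc var from above and using the spaCy Token attributes dep_ and head (dependency-edges is a hypothetical name chosen for this example):

```clojure
(require '[libpython-clj.python :refer [py.-]])

;; Hypothetical helper, not part of libpython-clj or spaCy.
(defn dependency-edges
  "Returns [token-text dependency-label head-text] for each token in a spaCy Doc or Span."
  [span]
  (vec (map (fn [token]
              [(py.- token text)                 ; the token itself
               (py.- token dep_)                 ; its dependency relation
               (py.- (py.- token head) text)])   ; the token it attaches to
            span)))

;; Assumes `doc` is the analyzed text from the previous step.
(dependency-edges (first (py.- doc sents)))
```

Each triple corresponds to one arc in the rendered graph, so this is a convenient form if you would rather process the parse than look at it.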

Want more examples? Check them out here: https://github.com/gigasquid/libpython-clj-examples
