Micah P. Dombrowski / Jun 29 2019

How to Handle Data in Julia

1. Files

There are a number of ways to bring data into a Nextjournal article. Code cells run in Docker containers, so we can use Julia methods to download to the filesystem, or Linux tools in a Bash cell (e.g. curl or wget). However, we can also upload files directly to the article via the insert menu.

cubic.csv

Files uploaded in this way are immutably stored in a persistent, versioned database. We have read-only access to them under the /data directory; however, they are stored with a hashed name. To access them in a code cell, you insert a reference by filename using Ctrl-E or Cmd-E. Here, we can see our uploaded file's contents with a Bash cell:

cat 
cubic.csv

Usage is the same in any code cell—the reference appears as a string constant pointing to the stored file:

println(
cubic.csv
)

Thus, you can directly use these references in most functions that take a file path.

dataz = open(
cubic.csv
) do file
  [ parse(Float64, x) for x in readlines(file) ]
end
using Plots
Plots.plot(dataz, title="Cubic Function", xlabel="x", ylabel="f(x)")
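The parsing step above can be tried without the uploaded file. The following is a minimal sketch, assuming the file holds one number per line (the contents of cubic.csv are not shown here). The path `cubic_sample.csv` and the name `sample` are hypothetical stand-ins, and the blank-line filter is an addition that the cell above does not have.

```julia
# Hypothetical stand-in for the uploaded file: one number per line.
path = joinpath(mktempdir(), "cubic_sample.csv")
open(path, "w") do file
    for x in -2:2
        println(file, float(x)^3)
    end
end

# Same open/do/readlines pattern as above, but skipping blank lines so that
# an empty row does not make parse() throw.
sample = open(path) do file
    [parse(Float64, strip(line)) for line in readlines(file) if !isempty(strip(line))]
end
```

The do-block form of `open` closes the file when the block exits, even if `parse` throws on a malformed line.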

In some cases a file stored in /data won't work: you may need it to have its original filename and extension, or to open it with write permissions. A local copy of the file can be made using cp in a Bash cell, or in Julia.

cp(
cubic.csv
, "cubic.csv", force=true)
if isfile("cubic.csv")
  print("Size: $(filesize("cubic.csv")) bytes")
end

You start out with the root dir, /, as the working directory, but can move and add to the filesystem however you like, except for the special directories /data, /results, /usr/local/nvidia, /usr/local/cuda, and the usual Linux system directories (/dev, /proc, /sys).

cd("usr")
readdir()
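A plain `cd` call like the one above changes the working directory for the rest of the process. When that is not wanted, Julia's do-block form of `cd` restores the previous directory on exit. The sketch below uses a temporary directory and a hypothetical file name, `notes.txt`, so it does not touch any of the special directories.

```julia
start = pwd()
scratch = mktempdir()   # hypothetical scratch directory for this sketch

# cd(dir) do ... end runs the block inside dir, then returns to the old directory.
listing = cd(scratch) do
    write("notes.txt", "hello")   # relative path resolves inside scratch
    readdir()
end

back_where_we_started = (pwd() == start)
```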

To get data out of Nextjournal, you can write or copy files to the /results directory. Image files will be displayed in the article—you can right-click to download, and click to adjust display settings or add a caption.

Plots.savefig("/results/cubic.svg")

All other files will appear as a download link.

open("/results/test.txt", "w") do file
  write(file, "Some text.\nSome \"other\" text.")
end
test.txt
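For short text files, the one-call forms of `write` and `read` do the same job as the `open` block above. This sketch covers only the plain Julia side and writes to a temporary directory so it runs anywhere. On Nextjournal the target would be `/results` instead, and `roundtrip.txt` is a hypothetical file name.

```julia
outdir = mktempdir()                        # on Nextjournal this would be "/results"
path = joinpath(outdir, "roundtrip.txt")    # hypothetical file name

text = "Some text.\nSome \"other\" text."
nbytes = write(path, text)       # write(filename, x) returns the number of bytes written
readback = read(path, String)    # read the whole file back as one String
```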

You can also access files that were previously placed in /results, from other cells or even the same cell if recursion is needed. These are stored in the same database as uploaded files, and references are inserted in the same way, with Cmd/Ctrl-E.

open(s -> print(read(s, String)), 
test.txt
)

Files downloaded from within cells—by either Bash tools or in-language methods—can be preserved in either of two ways. The preferred method is to download the file directly into /results, thus ensuring the file will be saved and versioned for reference elsewhere in the article.

download("http://nextjournal.com/images/logo.svg", "/results/logo.svg")

You can even reference them in other runtimes and languages.

cp 
logo.svg
/logo.svg

The other way to preserve files downloaded into the filesystem is to export the runtime as a new environment. While this isn't as flexible as the above method, it can make sense if the environment is being exported anyway for use in the article.

2. Cell Data

A Nextjournal language cell will attempt to display its 'return value', i.e. the value of the last expression evaluated in the cell. Strings and simple variables will be printed, while more complicated structures (arrays, dictionaries) will appear as an expandable tree.

dataz
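In plain Julia terms, the value in question is the value of the last expression. The sketch below uses `begin ... end` blocks as stand-ins for cells. This is an analogy based on ordinary Julia semantics, not a description of how Nextjournal evaluates cells, and all names in it are hypothetical.

```julia
# Each begin...end block stands in for one cell (an analogy only).
cell_a = begin
    vals = [1, 2, 3]
    sum(vals)               # last expression: this is the block's value
end

cell_b = begin
    d = Dict("a" => 1)      # an assignment evaluates to the assigned value
end

cell_c = begin
    println("side effect only")
    nothing                 # explicit "no value"
end
```

Here `cell_a` is `6`, `cell_b` is the dictionary, and `cell_c` is `nothing`.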

This display is possible because Nextjournal exports the final returned value in a cell. In addition to displaying the value this means that, like files, we can reactively reference data from other cells without inheriting their process state. Data references are inserted like file refs, with Cmd/Ctrl-E.

These reactive references even work between different languages, including client-side ClojureScript and JavaScript.

/* Referencing Julia data in a Javascript cell! */

var data = 
julia data
for (var i=1, xes=[]; i <= data.length; i++) xes.push(i)
var trace = [{ type: 'scatter', mode: 'lines', name: 'cubic', x: xes, y: data }]
var layout = { title: 'Cubic Function', xaxis: { title: 'x' }, yaxis: { title: 'f(x)' }}
Nextjournal.plot(trace, layout)

References can also exist between text and code, either using numbers in text to specify the input of a cell, or taking the result of a cell and using it in a paragraph. To insert a referable number in-text, select the Number entry in the Cmd/Ctrl-E menu.

Let's plot an n-lobed parametric curve:

function pcurve(n) {
  if (n >= 3) {
    var k = n - 1
    var ts = [...Array(360).keys()].map(function(x) { return x*2*Math.PI/360; })
    var xs = ts.map(function(t) { return Math.cos(k*t)/2+Math.sin(t)/3 })
    var ys = ts.map(function(t) { return Math.sin(k*t)/2+Math.cos(t)/3 })  
    
    var trace = [{ type: 'scatter', mode: 'lines', name: 'cubic', x: xs, y: ys }]
    var layout = { title: n + '-Lobe Curve',
      xaxis: { title: 'x', range: [-1.5,1.5] }, 
      yaxis: { title: 'f(x)', range: [-1.5, 1.5] }}

    return Nextjournal.plot(trace, layout)
  }
}

pcurve
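For comparison, the same curve can be computed in Julia. This is a sketch of the point calculation only, using the formulas from the JavaScript cell above. The name `pcurve_points` and the `samples` keyword are hypothetical. Unlike the JavaScript version, which returns nothing when n is below 3, this one throws an error.

```julia
# x(t) = cos(k t)/2 + sin(t)/3,  y(t) = sin(k t)/2 + cos(t)/3,  with k = n - 1
function pcurve_points(n::Integer; samples::Integer=360)
    n >= 3 || throw(ArgumentError("n must be at least 3"))
    k = n - 1
    ts = [2π * i / samples for i in 0:samples-1]   # samples points on [0, 2π)
    xs = [cos(k * t) / 2 + sin(t) / 3 for t in ts]
    ys = [sin(k * t) / 2 + cos(t) / 3 for t in ts]
    return xs, ys
end

curve_x, curve_y = pcurve_points(5)
```

The two returned vectors could then be passed to a plotting call such as the `Plots.plot` used earlier.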

Note that in-text number objects are reactive. In both the draft and published views you can click on the number and scrub left and right with the pointer to increase and decrease the number. For client-side languages this will have an immediate effect.

Here we'll use the function returned above with one reference, and plot a -lobed curve with another. Try out scrubbing over the number object in the previous sentence to instantly modify the plot!

parametric curve
(
n
)

Finally, by inserting an in-text reference we can display the number object above, or the results of any cell, such as the one below:

["abc", 123]