<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
    <title>LITERARY MACHINES</title> <subtitle>digital libraries, books, archives.</subtitle> <link
        href="https://literarymachin.es/atom.xml" rel="self" type="application/atom+xml" />
        <link
        href="https://literarymachin.es" />
        <generator uri="https://www.getzola.org/">Zola</generator>
        <updated>2026-03-19T00:00:00+00:00</updated>
        <id>https://literarymachin.es/atom.xml</id> <entry xml:lang="en">
        <title>mkiiif, yet another static IIIF generator</title> <published>2026-03-19T00:00:00+00:00</published>
                <updated>2026-03-19T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/mkiiif/" />
                <link rel="alternate"
            href="https://literarymachin.es/mkiiif/" type="text/html" />
                <id>https://literarymachin.es/mkiiif/</id> <content
            type="html">&lt;p&gt;I revisited an &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iiif-presentation.go&quot;&gt;old Go package&lt;&#x2F;a&gt; I&#x27;ve been using over the past few years to build IIIF manifests — nothing fancy, just some glue around structs and JSON. From that I built a new CLI, &lt;code&gt;mkiiif&lt;&#x2F;code&gt;, to generate IIIF manifests from static images (tiled or not). There are plenty of similar tools out there (&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;glenrobson&#x2F;iiif-tiler&quot;&gt;iiif-tiler&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;huggingface.co&#x2F;datasets&#x2F;uv-scripts&#x2F;iiif-tiles&#x2F;blob&#x2F;main&#x2F;tile_iiif.py&quot;&gt;tile-iiif&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;IIIF-Commons&#x2F;biiif&quot;&gt;biiif&lt;&#x2F;a&gt;, ...) but none quite matched the CLI ergonomics I needed for my daily workflow.&lt;&#x2F;p&gt;
&lt;p&gt;I moved the library to a new repository, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iiif&quot;&gt;atomotic&#x2F;iiif&lt;&#x2F;a&gt;. The tool &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iiif&#x2F;tree&#x2F;main&#x2F;cmd&#x2F;mkiiif&quot;&gt;mkiiif&lt;&#x2F;a&gt; can be installed by &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iiif&#x2F;releases&quot;&gt;downloading a binary release&lt;&#x2F;a&gt; or with Go:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;go install github.com&#x2F;docuverse&#x2F;iiif&#x2F;cmd&#x2F;mkiiif@latest&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;&lt;code&gt;mkiiif&lt;&#x2F;code&gt; can generate an IIIF manifest &lt;strong&gt;from a source directory containing images&lt;&#x2F;strong&gt;, or &lt;strong&gt;from a PDF file&lt;&#x2F;strong&gt; that gets exploded and converted to images via &lt;code&gt;mupdf&lt;&#x2F;code&gt;. Output images can be either untiled or static tiles generated with &lt;code&gt;vips&lt;&#x2F;code&gt;. Both approaches produce an &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;iiif.io&#x2F;api&#x2F;image&#x2F;3.0&#x2F;compliance&#x2F;&quot;&gt;IIIF Level 0&lt;&#x2F;a&gt; compliant layout: static files that can be served from any HTTP server, with no image server required. Untiled output is less efficient for large images but perfectly fine for printed books, papers, and similar material.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;mupdf.readthedocs.io&#x2F;en&#x2F;latest&#x2F;tools&#x2F;mutool.html&quot;&gt;mupdf&lt;&#x2F;a&gt; and &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.libvips.org&#x2F;&quot;&gt;vips&lt;&#x2F;a&gt; are external dependencies that need to be installed separately. They are invoked via subprocess; I chose not to add Go library wrappers around them to keep the tool simple. WASM ports of both may become viable in the future.&lt;&#x2F;p&gt;
&lt;p&gt;The CLI usage:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;Usage:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; mkiiif&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -id&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;i&lt;&#x2F;span&gt;&lt;span&gt;d&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -base&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;ur&lt;&#x2F;span&gt;&lt;span&gt;l&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -title&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;titl&lt;&#x2F;span&gt;&lt;span&gt;e&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -source&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;dir&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;|&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;pdf&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -destination&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;di&lt;&#x2F;span&gt;&lt;span&gt;r&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span&gt; 
[-tiles]&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;  -base&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; string&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;        Base&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; URL where the manifest will be served&lt;&#x2F;span&gt;&lt;span&gt; (e.g.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; https:&#x2F;&#x2F;example.org&#x2F;iiif&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;  -destination&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; string&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;        Output&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; directory&lt;&#x2F;span&gt;&lt;span&gt;;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; a&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; subdirectory named&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;i&lt;&#x2F;span&gt;&lt;span&gt;d&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; will be created inside it, containing the images and manifest.json&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;  -id&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; string&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;        Unique&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; identifier for the manifest&lt;&#x2F;span&gt;&lt;span&gt; (e.g.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; book1&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;  -resolution&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; int&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;        Resolution&lt;&#x2F;span&gt;&lt;span&gt; (DPI) used when converting PDF pages to images via mutool (&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;default&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 150&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;  -source&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; string&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;        Path&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; to a directory of images or a PDF file to convert&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;  -tiles&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;        Generate&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; IIIF image tiles for each image using vips dzsave&lt;&#x2F;span&gt;&lt;span&gt; (requires&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; vips&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;  -title&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; string&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;        Human-readable&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; title of the manifest&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Example:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; mkiiif -base https:&#x2F;&#x2F;digital.library.org -destination .&#x2F;public -id iiif01 -source &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt;&#x2F;book.pdf -title &lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;iiif 01&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Or with tiling:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; mkiiif -base https:&#x2F;&#x2F;digital.library.org -destination .&#x2F;public -id iiif01 -source &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt;&#x2F;book.pdf -title &lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;iiif 01&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt; -tiles&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The two commands produce the following structures inside &lt;code&gt;.&#x2F;public&lt;&#x2F;code&gt;, first untiled and then tiled:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;└── iiif01&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    ├── index.html&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    ├── manifest.json&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    ├── page-001.png&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    ├── page-002.png&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    ├── page-....png&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    └── page-....png&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;└── iiif01&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   ├── index.html&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   ├── manifest.json&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   ├── page-001&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 0,0,1024,1024&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   └── 512,512&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │       └── 0&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │           └── default.jpg&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;...&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       ├── full&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   ├── 362,501&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   │   └── 0&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   │       └── default.jpg&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   └── max&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │       └── 0&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │           └── default.jpg&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       └── info.json&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;...&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
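&lt;p&gt;The tiled tree mirrors the IIIF Image API URL syntax, &lt;code&gt;{identifier}&#x2F;{region}&#x2F;{size}&#x2F;{rotation}&#x2F;{quality}.{format}&lt;&#x2F;code&gt;, which is why a plain file server can answer those requests. Here is a minimal Python sketch of that mapping; the function names are hypothetical and are not part of &lt;code&gt;mkiiif&lt;&#x2F;code&gt;:&lt;&#x2F;p&gt;

```python
def static_image_path(identifier, region="full", size="max", rotation="0",
                      quality="default", fmt="jpg"):
    """Build the relative path of a pre-rendered IIIF Image API file.

    Follows the Image API URL syntax
    {identifier}/{region}/{size}/{rotation}/{quality}.{format},
    which is also how the tiled directory tree is laid out.
    """
    return f"{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"


def info_json_path(identifier):
    """Relative path of the image information document for one page."""
    return f"{identifier}/info.json"


# One tile and one full-size image, using values taken from the tree above
print(static_image_path("page-001", region="0,0,1024,1024", size="512,512"))
print(static_image_path("page-001"))
print(info_json_path("page-001"))
```

&lt;p&gt;With a Level 0 layout only the combinations that exist on disk can be requested, so a viewer is expected to read &lt;code&gt;info.json&lt;&#x2F;code&gt; and ask only for the advertised tiles and sizes.&lt;&#x2F;p&gt;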
&lt;p&gt;The directory can then be served from &lt;code&gt;https:&#x2F;&#x2F;digital.library.org&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;I&#x27;ve adopted this URL scheme:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;http&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;https:&#x2F;&#x2F;{&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F44747;&quot;&gt;base&lt;&#x2F;span&gt;&lt;span&gt;}&#x2F;{&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F44747;&quot;&gt;id&lt;&#x2F;span&gt;&lt;span&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    &#x2F;manifest.json — the IIIF manifest&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    &#x2F;index.html    — a simple viewer&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;So in the example above, &lt;code&gt;https:&#x2F;&#x2F;digital.library.org&#x2F;iiif01&lt;&#x2F;code&gt; opens a full viewer to browse the object. The viewer used is &lt;strong&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;d-flood.github.io&#x2F;triiiceratops&#x2F;&quot;&gt;Triiiceratops&lt;&#x2F;a&gt;&lt;&#x2F;strong&gt;, the newest viewer in the IIIF ecosystem. Built on Svelte and OpenSeadragon, it is still young but very usable, lightweight, and easy to embed and customize. It is my favourite viewer.&lt;&#x2F;p&gt;
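&lt;p&gt;The URL scheme can be written down as a small Python sketch. The helper name &lt;code&gt;bundle_urls&lt;&#x2F;code&gt; is hypothetical, and the base is assumed to include the protocol, as in the &lt;code&gt;-base&lt;&#x2F;code&gt; flag of the examples:&lt;&#x2F;p&gt;

```python
def bundle_urls(base, manifest_id):
    """Return the viewer and manifest URLs for one generated bundle.

    `base` is the value passed to -base (protocol included) and
    `manifest_id` is the value passed to -id.
    """
    root = f"{base.rstrip('/')}/{manifest_id}"
    return {
        "viewer": f"{root}/index.html",
        "manifest": f"{root}/manifest.json",
    }


urls = bundle_urls("https://digital.library.org", "iiif01")
print(urls["viewer"])
print(urls["manifest"])
```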
&lt;p&gt;&lt;code&gt;mkiiif&lt;&#x2F;code&gt; doesn&#x27;t handle metadata for now (and probably won&#x27;t) — the manifest can be easily patched to insert descriptive metadata in a later step, after image preparation, pulling from any existing datasource or metadata catalog.&lt;&#x2F;p&gt;
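&lt;p&gt;As an illustration of that later patching step, here is a minimal Python sketch. It assumes the manifest follows the IIIF Presentation API 3.0, where &lt;code&gt;metadata&lt;&#x2F;code&gt; is a list of label&#x2F;value pairs expressed as language maps. The function names and the sample fields are hypothetical, and the real source of the metadata is left to the reader:&lt;&#x2F;p&gt;

```python
import json
from pathlib import Path


def add_metadata(manifest, fields, lang="en"):
    """Return a copy of the manifest with label/value pairs appended.

    Assumes Presentation API 3.0 language maps:
    {"label": {"en": ["Creator"]}, "value": {"en": ["..."]}}
    """
    patched = dict(manifest)
    entries = list(patched.get("metadata", []))
    for label, value in fields.items():
        entries.append({"label": {lang: [label]}, "value": {lang: [value]}})
    patched["metadata"] = entries
    return patched


def patch_manifest_file(path, fields, lang="en"):
    """Read a manifest.json, add metadata, and write it back in place."""
    path = Path(path)
    manifest = json.loads(path.read_text(encoding="utf-8"))
    patched = add_metadata(manifest, fields, lang)
    path.write_text(
        json.dumps(patched, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return patched


# In-memory example with made-up values
example = add_metadata({"id": "iiif01", "type": "Manifest"}, {"Creator": "Unknown"})
print(json.dumps(example["metadata"]))
```

&lt;p&gt;Keeping the pure function separate from the file handling makes it easy to feed it rows from a catalog export or a database query.&lt;&#x2F;p&gt;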
&lt;p&gt;Here is a full working example: &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;docuver.se&#x2F;iiif&#x2F;p3tgsk8jqt&#x2F;&quot;&gt;https:&#x2F;&#x2F;docuver.se&#x2F;iiif&#x2F;p3tgsk8jqt&#x2F;&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;p&gt;A few open questions I haven&#x27;t fully resolved:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;The main drawback of generating IIIF this way is that you end up managing a large number of files on the filesystem, and handling millions of small image tiles can be slow (and costly). This is where IIIF intersects, and overlaps, with similar practices in digital preservation, such as BagIt, OCFL, and WARC&#x2F;WACZ. So far there&#x27;s no specification or viewer implementation that handles IIIF containers (e.g. a zip file bundling images, tiles, and the manifest). &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;orgs&#x2F;IIIF-Commons&#x2F;discussions&#x2F;4&quot;&gt;This has been discussed&lt;&#x2F;a&gt; in the past; I&#x27;ve recently been looking at analogous approaches like &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;orgs&#x2F;IIIF-Commons&#x2F;discussions&#x2F;4#discussioncomment-6541478&quot;&gt;GeoTIFF&lt;&#x2F;a&gt; and &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;szi&#x2F;&quot;&gt;SZI&lt;&#x2F;a&gt;.&lt;&#x2F;li&gt;
&lt;li&gt;A static IIIF bundle generated with this CLI still needs to be served from an HTTP server, with the base URL defined at derivation time. Could such a bundle be opened from localhost and viewed directly in the browser? Service Workers might help here (even if HTTP is still needed), but it&#x27;s a rabbit hole I haven&#x27;t explored yet.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;The CLI is pretty bare-bones — feel free to suggest improvements or report bugs. I&#x27;ve been using it over the past weeks as part of a personal project: an amateur digital library built around a DIY book scanner I assembled at home, to preserve magazines, zines, and similar material (content NSFW and out of scope to link here).&lt;&#x2F;p&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>Build a static search for an Internet Archive Collection with Pagefind</title> <published>2026-03-07T00:00:00+00:00</published>
                <updated>2026-03-07T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/pagefind-internetarchive/" />
                <link rel="alternate"
            href="https://literarymachin.es/pagefind-internetarchive/" type="text/html" />
                <id>https://literarymachin.es/pagefind-internetarchive/</id> <content
            type="html">&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;pagefind.app&#x2F;&quot;&gt;Pagefind&lt;&#x2F;a&gt; caught my attention about a year ago, and since then I&#x27;ve adopted it in several hobby projects (nothing work-related): some blogs built with static generators like Hugo or Zola, some old HTML content distributed on CD-ROM, and some mailing list archives where I converted mbox files to HTML and then indexed them.&lt;&#x2F;p&gt;
&lt;p&gt;The tool is great, better for my needs than other JavaScript search libraries (though it&#x27;s not really fair to compare them, since they&#x27;re quite different). Pagefind is a search tool that runs entirely in the browser with zero server-side dependencies. It indexes your content into a compact binary index, using WASM to run search in the browser.&lt;&#x2F;p&gt;
&lt;p&gt;It can&#x27;t completely replace server-side search technologies like Solr or Elasticsearch, mainly because the index can&#x27;t be updated incrementally. But for many small to medium digital libraries or collections that are rarely updated once completed, it&#x27;s an extremely good tool: very fast, easy to integrate into web pages, and requires almost no maintenance.&lt;&#x2F;p&gt;
&lt;p&gt;Until now I was convinced that the only way to build an index was by reading content from existing HTML files. That changed when I listened to this &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;talkpython.fm&#x2F;episodes&#x2F;show&#x2F;538&#x2F;python-in-digital-humanities&quot;&gt;Python in Digital Humanities&lt;&#x2F;a&gt; podcast, where &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;fosstodon.org&#x2F;@davidflood&quot;&gt;David Flood&lt;&#x2F;a&gt; mentioned:&lt;&#x2F;p&gt;
&lt;blockquote&gt;
&lt;p&gt;Critically, PageFind has a Python API that lets you build indexes programmatically from database dumps rather than only from HTML files.&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;I&#x27;d completely missed that Pagefind has a &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;pagefind.app&#x2F;docs&#x2F;py-api&#x2F;&quot;&gt;Python API&lt;&#x2F;a&gt; (and a Node one too), which makes it easy to build an index from any data source.&lt;&#x2F;p&gt;
&lt;hr &#x2F;&gt;
&lt;p&gt;Here&#x27;s a basic example: building a search index for an Internet Archive collection.&lt;&#x2F;p&gt;
&lt;p&gt;I&#x27;m using the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;Pagefind&#x2F;pagefind&#x2F;releases&#x2F;tag&#x2F;v1.5.0-beta.1&quot;&gt;Pagefind pre-release&lt;&#x2F;a&gt; here, which introduces a &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;ui.pagefind.app&#x2F;&quot;&gt;new UI with web components&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;Init&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;uv init .&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;uv add internetarchive&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;uv add --prerelease=allow &amp;#39;pagefind[bin]&amp;#39;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;&lt;strong&gt;Directory to save the index and serve the UI&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;mkdir .&#x2F;web&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;&lt;strong&gt;Python code:&lt;&#x2F;strong&gt; create an index from the metadata of this &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;archive.org&#x2F;details&#x2F;radical-archives&quot;&gt;collection&lt;&#x2F;a&gt; (actually a collection of subcollections in the Internet Archive, with Italian content related to radical movements).&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;python&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;import&lt;&#x2F;span&gt;&lt;span&gt; asyncio&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;import&lt;&#x2F;span&gt;&lt;span&gt; logging&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;import&lt;&#x2F;span&gt;&lt;span&gt; os&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;import&lt;&#x2F;span&gt;&lt;span&gt; internetarchive&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;from&lt;&#x2F;span&gt;&lt;span&gt; pagefind.index&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; import&lt;&#x2F;span&gt;&lt;span&gt; PagefindIndex, IndexConfig&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;logging.basicConfig(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt;level&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;=&lt;&#x2F;span&gt;&lt;span&gt;os.environ.get(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;LOG_LEVEL&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;DEBUG&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;))&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;log&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span&gt; logging.getLogger(__name__)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;async def&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; main&lt;&#x2F;span&gt;&lt;span&gt;():&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    config&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span&gt; IndexConfig(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt;output_path&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;.&#x2F;web&#x2F;pagefind&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;    async with&lt;&#x2F;span&gt;&lt;span&gt; PagefindIndex(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt;config&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;=&lt;&#x2F;span&gt;&lt;span&gt;config)&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; as&lt;&#x2F;span&gt;&lt;span&gt; index:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        log.info(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;Searching collection:radical-archives ...&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        results&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span&gt; internetarchive.search_items(&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;            &amp;quot;collection:radical-archives&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt;            fields&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;=&lt;&#x2F;span&gt;&lt;span&gt;[&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;identifier&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;title&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;description&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        )&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        count&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 0&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;        for&lt;&#x2F;span&gt;&lt;span&gt; item&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; in&lt;&#x2F;span&gt;&lt;span&gt; results:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            identifier&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span&gt; item.get(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;identifier&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            title&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span&gt; item.get(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;title&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;, identifier)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            description&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span&gt; item.get(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;description&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            url&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt; f&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;https:&#x2F;&#x2F;archive.org&#x2F;details&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;{&lt;&#x2F;span&gt;&lt;span&gt;identifier&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;}&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            thumbnail&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt; f&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;https:&#x2F;&#x2F;archive.org&#x2F;services&#x2F;img&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;{&lt;&#x2F;span&gt;&lt;span&gt;identifier&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;}&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;            if&lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt; isinstance&lt;&#x2F;span&gt;&lt;span&gt;(description,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt; list&lt;&#x2F;span&gt;&lt;span&gt;):&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                description&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot; &amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;.join(description)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;            await&lt;&#x2F;span&gt;&lt;span&gt; index.add_custom_record(&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt;                url&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;=&lt;&#x2F;span&gt;&lt;span&gt;url,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt;                content&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;=&lt;&#x2F;span&gt;&lt;span&gt;description&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; or&lt;&#x2F;span&gt;&lt;span&gt; title,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt;                language&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;en&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt;                meta&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;=&lt;&#x2F;span&gt;&lt;span&gt;{&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;                    &amp;quot;title&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: title,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;                    &amp;quot;description&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: description,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;                    &amp;quot;image&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: thumbnail,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                },&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            )&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            count&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; +=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            log.debug(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;indexed &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;%s&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;: &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;%s&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;, identifier, title)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        log.info(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;Indexed &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;%d&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; items. Writing index ...&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;, count)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    log.info(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;Done. Index written to .&#x2F;web&#x2F;pagefind&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;if&lt;&#x2F;span&gt;&lt;span&gt; __name__&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; ==&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;__main__&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    asyncio.run(main())&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;&lt;strong&gt;HTML UI in &lt;code&gt;.&#x2F;web&#x2F;index.html&lt;&#x2F;code&gt;&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;html&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;!&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;DOCTYPE&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; html&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;html&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; lang&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;en&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;	&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;head&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;		&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;meta&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; charset&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;UTF-8&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;		&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;meta&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; name&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;viewport&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; content&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;width=device-width, initial-scale=1.0&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;		&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;title&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;pagefind-ia&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;title&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;		&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;link&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; href&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;&#x2F;pagefind&#x2F;pagefind-component-ui.css&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; rel&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;stylesheet&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;		&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;script&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; src&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;&#x2F;pagefind&#x2F;pagefind-component-ui.js&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; type&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;module&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;script&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;	&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;head&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;	&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;body&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;		&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-modal-trigger&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-modal-trigger&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;		&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-modal&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;			&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-modal-header&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;				&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-input&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-input&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;			&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-modal-header&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;			&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-modal-body&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;				&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-summary&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-summary&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;				&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-results&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; show-images&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-results&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;			&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-modal-body&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;			&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-modal-footer&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;				&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-keyboard-hints&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-keyboard-hints&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;			&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-modal-footer&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;		&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;pagefind-modal&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;	&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;body&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;html&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;&lt;strong&gt;Result&lt;&#x2F;strong&gt;: easy to embed anywhere!&lt;&#x2F;p&gt;
&lt;link href=&quot;&#x2F;pagefind&#x2F;pagefind-component-ui.css&quot; rel=&quot;stylesheet&quot;&gt;
&lt;script src=&quot;&#x2F;pagefind&#x2F;pagefind-component-ui.js&quot; type=&quot;module&quot;&gt;&lt;&#x2F;script&gt;

&lt;pagefind-modal-trigger&gt;&lt;&#x2F;pagefind-modal-trigger&gt;
&lt;pagefind-modal&gt;
	&lt;pagefind-modal-header&gt;
		&lt;pagefind-input&gt;&lt;&#x2F;pagefind-input&gt;
	&lt;&#x2F;pagefind-modal-header&gt;
	&lt;pagefind-modal-body&gt;
		&lt;pagefind-summary&gt;&lt;&#x2F;pagefind-summary&gt;
		&lt;pagefind-results show-images&gt;&lt;&#x2F;pagefind-results&gt;
	&lt;&#x2F;pagefind-modal-body&gt;
	&lt;pagefind-modal-footer&gt;
		&lt;pagefind-keyboard-hints&gt;&lt;&#x2F;pagefind-keyboard-hints&gt;
	&lt;&#x2F;pagefind-modal-footer&gt;
&lt;&#x2F;pagefind-modal&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>SZI: Tiled Images from ZIP Archives</title> <published>2025-10-27T00:00:00+00:00</published>
                <updated>2025-10-27T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/szi/" />
                <link rel="alternate"
            href="https://literarymachin.es/szi/" type="text/html" />
                <id>https://literarymachin.es/szi/</id> <content
            type="html">&lt;p&gt;I&#x27;ve long been looking for simple ways to serve digitized documents as static files, for cases where the cost of maintaining an IIIF image server is prohibitive and I need something simpler to manage and preserve.&lt;&#x2F;p&gt;
&lt;p&gt;IIIF &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;iiif.io&#x2F;api&#x2F;image&#x2F;3.0&#x2F;compliance&#x2F;&quot;&gt;Level 0&lt;&#x2F;a&gt; is an option for serving static images in IIIF manifests without an image server. However, for large images this approach is inefficient, so tiling is required to serve partial images on demand.&lt;br &#x2F;&gt;
There are ongoing discussions and experiments exploring how to bring static tiles to IIIF viewers, addressing a particular need: serving tiles from ZIP files, which offer significant advantages for management, portability, and storage. Reading remote ZIP content over HTTP using Range requests is now standard practice, popularized by &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;specs.webrecorder.net&#x2F;wacz&#x2F;1.1.1&#x2F;&quot;&gt;WACZ&lt;&#x2F;a&gt; for serving web archives.&lt;&#x2F;p&gt;
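&lt;p&gt;A minimal local sketch of why this works, in Python: the ZIP index (the central directory) sits at the end of the file, so a reader can fetch that first and then only the bytes of the entry it wants. Here an in-memory buffer stands in for the remote file and every &lt;code&gt;read&lt;&#x2F;code&gt; call stands in for a Range request; the class and function names are my own illustration, not part of WACZ or any IIIF tool.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;python&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;import io&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;import zipfile&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;class CountingFile(io.BytesIO):&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    # Stand-in for a remote file: counts the bytes a reader actually asks for.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    def __init__(self, data):&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        super().__init__(data)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        self.bytes_read = 0&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    def read(self, size=-1):&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        chunk = super().read(size)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        self.bytes_read += len(chunk)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        return chunk&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;def build_archive(tile_count=50, tile_size=4096):&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    # Uncompressed (stored) ZIP with dummy tiles, built in memory.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    buf = io.BytesIO()&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    with zipfile.ZipFile(buf, &amp;quot;w&amp;quot;, zipfile.ZIP_STORED) as zf:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        for i in range(tile_count):&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            zf.writestr(f&amp;quot;tiles&#x2F;{i}.jpg&amp;quot;, bytes([i % 256]) * tile_size)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    return buf.getvalue()&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;def read_one_tile(archive, name):&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    # Open the archive and read a single entry, tracking the bytes fetched.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    remote = CountingFile(archive)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    with zipfile.ZipFile(remote) as zf:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        data = zf.read(name)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    return data, remote.bytes_read&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;archive = build_archive()&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;tile, fetched = read_one_tile(archive, &amp;quot;tiles&#x2F;7.jpg&amp;quot;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;print(f&amp;quot;archive: {len(archive)} bytes, fetched: {fetched} bytes&amp;quot;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Only the directory and one tile are read, a small fraction of the archive. On a real server each of those reads would become an HTTP request with a &lt;code&gt;Range&lt;&#x2F;code&gt; header.&lt;&#x2F;p&gt;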
&lt;p&gt;This &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;orgs&#x2F;IIIF-Commons&#x2F;discussions&#x2F;4&quot;&gt;GitHub discussion on IIIF Commons&lt;&#x2F;a&gt; offers potential solutions for ZIP file-based tile delivery.&lt;br &#x2F;&gt;
I had already run an &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;orgs&#x2F;IIIF-Commons&#x2F;discussions&#x2F;4#discussioncomment-6541478&quot;&gt;experiment&lt;&#x2F;a&gt; using &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;pearcetm&#x2F;GeoTIFFTileSource&quot;&gt;GeoTIFFTileSource&lt;&#x2F;a&gt; with OpenSeadragon to access remote tiled TIFF files. Here is an &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;pub-0f1c9e6ddb92456a85802303778fa724.r2.dev&#x2F;demo&#x2F;index.html&quot;&gt;example&lt;&#x2F;a&gt; hosted on Cloudflare R2, featuring a 600 MB TIFF file converted with VIPS.&lt;&#x2F;p&gt;
&lt;p&gt;Today I came across the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;smartinmedia&#x2F;SZI-Format&quot;&gt;SZI Format&lt;&#x2F;a&gt; and the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;sundogbio&#x2F;szi-tile-source&quot;&gt;SZI Tile Source&lt;&#x2F;a&gt; for OpenSeadragon.&lt;br &#x2F;&gt;
Although not IIIF-based, this solution allows reading a remote &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Deep_Zoom&quot;&gt;DeepZoom&lt;&#x2F;a&gt; .dzi file packaged in a ZIP file.&lt;&#x2F;p&gt;
&lt;p&gt;Let&#x27;s test this approach by using a PDF file as source and converting its pages to tiled images.&lt;&#x2F;p&gt;
&lt;p&gt;To extract pages from a PDF, I use &lt;strong&gt;mutool&lt;&#x2F;strong&gt; from &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;ArtifexSoftware&#x2F;mupdf&quot;&gt;MuPDF&lt;&#x2F;a&gt;, but many similar tools exist, such as &lt;strong&gt;pdftoppm&lt;&#x2F;strong&gt; from &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;poppler.freedesktop.org&#x2F;&quot;&gt;Poppler&lt;&#x2F;a&gt;. Zero-padding the page number (&lt;code&gt;%03d&lt;&#x2F;code&gt;) keeps the files in page order when they are listed by name:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;mutool draw -r 300 -o img-%03d.png file.pdf&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Next, use &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.libvips.org&#x2F;&quot;&gt;VIPS&lt;&#x2F;a&gt; to &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.libvips.org&#x2F;API&#x2F;current&#x2F;type_func.Image.arrayjoin.html&quot;&gt;assemble all images into a grid&lt;&#x2F;a&gt; and &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.libvips.org&#x2F;API&#x2F;current&#x2F;method.Image.dzsave.html&quot;&gt;convert to DeepZoom&lt;&#x2F;a&gt; format:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;IMG=$(ls *.png)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;vips arrayjoin &amp;quot;$IMG&amp;quot; file.dz --across 8 --background &amp;quot;255, 255, 255&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
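&lt;p&gt;To find a given page inside the mosaic afterwards, the arithmetic is straightforward. A small Python sketch, assuming all pages have the same pixel size, no extra spacing is set, and the images fill each row left to right as &lt;code&gt;--across 8&lt;&#x2F;code&gt; implies; the function names and the 2480x3508 page size are only illustrative.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;python&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;def grid_cell(page_index, across=8):&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    # (row, column) of a zero-based page index in a row-major grid.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    return divmod(page_index, across)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;def grid_origin(page_index, page_width, page_height, across=8):&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    # Top-left pixel of that page, assuming every page has the same size.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    row, col = grid_cell(page_index, across)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    return col * page_width, row * page_height&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;print(grid_cell(10))                # (1, 2): 11th page, second row, third column&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;print(grid_origin(10, 2480, 3508))  # (4960, 3508)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;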
&lt;p&gt;VIPS already writes a ZIP file, &lt;code&gt;file.dz&lt;&#x2F;code&gt;, which only needs to be renamed to &lt;code&gt;file.szi&lt;&#x2F;code&gt; and served with a simple HTML page like this one:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;html&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;!&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;DOCTYPE&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; html&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;html&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; lang&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;en&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;head&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;meta&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; charset&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;UTF-8&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;title&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;SZI Tile Source Demo&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;title&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;script&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; src&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;https:&#x2F;&#x2F;cdnjs.cloudflare.com&#x2F;ajax&#x2F;libs&#x2F;openseadragon&#x2F;5.0.1&#x2F;openseadragon.js&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;script&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;script&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; src&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;.&#x2F;dist&#x2F;szi-tile-source-v0.6.1.umd.cjs&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;script&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;style&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;    html&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; body&lt;&#x2F;span&gt;&lt;span&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      margin&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 0&lt;&#x2F;span&gt;&lt;span&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      padding&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 30&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;px&lt;&#x2F;span&gt;&lt;span&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      width&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 100&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;%&lt;&#x2F;span&gt;&lt;span&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      height&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 100&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;%&lt;&#x2F;span&gt;&lt;span&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      overflow&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt; hidden&lt;&#x2F;span&gt;&lt;span&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    }&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;    #osd-szi&lt;&#x2F;span&gt;&lt;span&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      width&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 100&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;vw&lt;&#x2F;span&gt;&lt;span&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      height&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 100&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;vh&lt;&#x2F;span&gt;&lt;span&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    }&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;style&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;head&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;body&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;div&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; id&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;osd-szi&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;div&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;script&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;    const&lt;&#x2F;span&gt;&lt;span&gt; sziUrl&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;file.szi&amp;#39;&lt;&#x2F;span&gt;&lt;span&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    OpenSeadragon.SziTileSource.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;createSziTileSource&lt;&#x2F;span&gt;&lt;span&gt;(sziUrl).&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;then&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;async&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt; tileSource&lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt; =&amp;gt;&lt;&#x2F;span&gt;&lt;span&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      const&lt;&#x2F;span&gt;&lt;span&gt; viewer&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; = new&lt;&#x2F;span&gt;&lt;span&gt; OpenSeadragon.&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;Viewer&lt;&#x2F;span&gt;&lt;span&gt;({&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        id:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;osd-szi&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        prefixUrl:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;https:&#x2F;&#x2F;cdnjs.cloudflare.com&#x2F;ajax&#x2F;libs&#x2F;openseadragon&#x2F;5.0.1&#x2F;images&#x2F;&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        tileSources: [tileSource],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;      });&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    })&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;script&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;body&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;html&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Here is a sample result, a &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;archive.org&#x2F;details&#x2F;radsoft-0105&#x2F;page&#x2F;n15&#x2F;mode&#x2F;2up&quot;&gt;scanned book&lt;&#x2F;a&gt; served as a 200 MB .szi file:
&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;pub-0f1c9e6ddb92456a85802303778fa724.r2.dev&#x2F;szi&#x2F;index.html&quot;&gt;https:&#x2F;&#x2F;pub-0f1c9e6ddb92456a85802303778fa724.r2.dev&#x2F;szi&#x2F;index.html&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;iframe src=&quot;https:&#x2F;&#x2F;pub-0f1c9e6ddb92456a85802303778fa724.r2.dev&#x2F;szi&#x2F;index.html&quot; allowfullscreen style=&quot;width: 100%; height: 900px&quot;&gt;&lt;&#x2F;iframe&gt;</content>
    </entry> <entry xml:lang="en">
        <title>The missing feature in digital libraries: searchable tables of contents</title> <published>2025-10-12T00:00:00+00:00</published>
                <updated>2025-10-12T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/ebook-toc/" />
                <link rel="alternate"
            href="https://literarymachin.es/ebook-toc/" type="text/html" />
                <id>https://literarymachin.es/ebook-toc/</id> <content
            type="html">&lt;p&gt;In the context of electronic books, I&#x27;ve always been frustrated by how reading applications relegate table-of-contents navigation to a minor feature of their UI&#x2F;UX.&lt;&#x2F;p&gt;
&lt;p&gt;(Note: Throughout history, &lt;em&gt;indexes&lt;&#x2F;em&gt; — those alphabetical listings at the back of books — have been crucial for knowledge access, as Dennis Duncan explores in &quot;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Index,_A_History_of_the&quot;&gt;Index, A History of the&lt;&#x2F;a&gt;&quot;. But this post focuses on tables of contents, which show the hierarchical structure of chapters and sections.)&lt;&#x2F;p&gt;
&lt;p&gt;I find it very useful to view the table of contents before opening a book, and I often do this in the terminal. You can easily write your own script in any programming language using an existing library for EPUB files (or PDF, or whatever format you need to read). For EPUB files, the simplest approach I have found is to use &lt;strong&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;readium&#x2F;cli&quot;&gt;Readium CLI&lt;&#x2F;a&gt;&lt;&#x2F;strong&gt; and &lt;strong&gt;jq&lt;&#x2F;strong&gt; to print a tree-like structure of the book. This is the script I use:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #88846F;&quot;&gt;#!&#x2F;bin&#x2F;bash&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #88846F;&quot;&gt;# Usage: .&#x2F;epub-toc.sh &amp;lt;epub-file&amp;gt;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;if&lt;&#x2F;span&gt;&lt;span&gt; [&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FD971F;&quot;&gt; $#&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; -eq&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 0&lt;&#x2F;span&gt;&lt;span&gt; ];&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; then&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt;    echo&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;Usage: &lt;&#x2F;span&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt;$0&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;lt;epub-file&amp;gt;&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;gt;&amp;amp;2&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt;    exit&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;fi&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;EPUB_FILE&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt;$1&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;if&lt;&#x2F;span&gt;&lt;span&gt; [&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; ! -f&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;$EPUB_FILE&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt; ];&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; then&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt;    echo&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;Error: File &amp;#39;&lt;&#x2F;span&gt;&lt;span&gt;$EPUB_FILE&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;#39; not found&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;gt;&amp;amp;2&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt;    exit&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;fi&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;if !&lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt; command&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -v&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; readium&lt;&#x2F;span&gt;&lt;span&gt; &amp;amp;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span&gt; &#x2F;dev&#x2F;null;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; then&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt;    echo&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;Error: &amp;#39;readium&amp;#39; command not found. Please install readium-cli.&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;gt;&amp;amp;2&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt;    exit&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;fi&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;if !&lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt; command&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -v&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; jq&lt;&#x2F;span&gt;&lt;span&gt; &amp;amp;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span&gt; &#x2F;dev&#x2F;null;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; then&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt;    echo&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;Error: &amp;#39;jq&amp;#39; command not found. Please install jq.&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;gt;&amp;amp;2&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt;    exit&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;fi&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;readium&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; manifest &amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;$EPUB_FILE&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; |&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; jq&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -r&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;  def tree($items; $prefix):&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;    $items | to_entries[] |&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;    (if .key == (($items | length) - 1) then&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;      $prefix + &amp;quot;└── &amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;    else&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;      $prefix + &amp;quot;├── &amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;    end) + .value.title,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;    (if .value.children then&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;      tree(.value.children; $prefix + (if .key == (($items | length) - 1) then &amp;quot;    &amp;quot; else &amp;quot;│   &amp;quot; end))&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;    else&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;      empty&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;    end);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;  if .toc then&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;    tree(.toc; &amp;quot;&amp;quot;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;  else&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;    &amp;quot;Error: No .toc field found in manifest&amp;quot; | halt_error(1)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;  end&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;#39;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;details&gt;
&lt;summary&gt;
Example of a book with a long and nested table of contents
&lt;&#x2F;summary&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ readium-toc La_comunicazione_imperfetta_-_Peppino_Ortoleva_Gabriele_Balbi.epub&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├── Copertina&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├── Frontespizio&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├── LA COMUNICAZIONE IMPERFETTA&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├── Introduzione&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   ├── 1. I percorsi movimentati, e accidentati, del comunicare.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   ├── 2. Teorie lineari della comunicazione: una breve archeologia.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   ├── 3. Oltre la linearità, verso l’imperfezione.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   └── 4. La struttura del libro.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├── Parte prima. Una mappa&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   ├── I. Malintesi&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 1. Capirsi male. Un’introduzione al tema.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 2. Una prima definizione, anzi due.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 3. A chi si deve il malinteso.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 4. Il gioco dei ruoli.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 5. L’andamento del malinteso.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 6. Le cause del malinteso.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   ├── 6.1. Errori e deformazioni materiali.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   ├── 6.2. Parlare lingue diverse.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   ├── 6.3. La comunicazione non verbale: toni, espressioni, gesti.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   ├── 6.4. La comunicazione verbale: l’inevitabile ambiguità del parlare.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   ├── 6.5. Detto e non detto.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   └── 6.6. Sovra-interpretare.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 7. Le conseguenze: il disagio e l’ostilità.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 8. La spirale del non capirsi.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 9. Uscire dal malinteso.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   └── 10. Il ruolo del malinteso nella comunicazione umana.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   ├── II. Malfunzionamenti&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 1. Malfunzionamenti involontari.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 2. Malfunzionamenti intenzionali.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 3. (In)tollerabilità del malfunzionamento.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 4. Contrastare il malfunzionamento: manutenzione e riparazione.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 5. Produttività del malfunzionamento.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   └── 6. Relativizzare il malfunzionamento: per una conclusione.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   ├── III. Scarsità e sovrabbondanza&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 1. Il peso della quantità.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   ├── 1.1. La scarsità informativa: effetti negativi e produttivi.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   ├── 1.2. La sovrabbondanza informativa: effetti negativi e produttivi.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   └── 1.3. Qualche principio generale.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 2. Politiche della scarsità e politiche dell’abbondanza.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   ├── 2.1. Accesso all’informazione, accesso al potere.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   └── 2.2. Controllare la circolazione dell’informazione: limitare o sommergere.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 3. Scarsità e abbondanza nell’economia della comunicazione.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   ├── 3.1. Il valore dell’informazione tra domanda e offerta.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   ├── 3.2. L’economia dell’attenzione.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   └── 3.3. I padroni della quantità.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   ├── 4. Le basi tecnologiche della scarsità e della sovrabbondanza.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   ├── 4.1. Effluvio comunicativo e scarsità materiale.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   │   └── 4.2. Scarsità e abbondanze oggettive o create ad arte.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │   └── 5. Gestire il troppo e il troppo poco.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │       ├── 5.1. Il troppo stroppia o melius est abundare quam deficere?&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │       ├── 5.2. Colmare un ambiente povero di informazioni.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   │       └── 5.3. Due concetti relativi.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   └── IV. Silenzi&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       ├── 1. La comunicazione zero.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   ├── 1.1. La presenza dell’assenza.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   ├── 1.2. I silenzi comunicano.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   └── 1.3. Silenzi codificati e silenzi enigmatici.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       ├── 2. Il silenzio del mittente.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       ├── 3. Il silenzio del ricevente.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       ├── 4. Il silenzio dei pubblici.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       ├── 5. Silenzi parziali: le omissioni.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       ├── 6. Il valore del silenzio: il segreto.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   ├── 6.1. Una breve tipologia dei segreti.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   ├── 6.2. Preservare e carpire i segreti.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   └── 6.3. Ancora sulla fragilità del segreto.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       └── 7. I paradossi del silenzio.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├── Parte seconda. Verso una teoria&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   └── V. La comunicazione è imperfetta&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       ├── 1. L’imperfezione inevitabile.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       ├── 2. Correggere, rimediare.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   ├── 2.1. Prima dell’invio: le correzioni umane, e non.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   ├── 2.2. Durante l’invio.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   └── 2.3. E quando la comunicazione ha già raggiunto il destinatario o l’arena pubblica?&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       ├── 3. Le vie dell’adattamento.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   ├── 3.1. Avere tempo.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   ├── 3.2. Adattarsi e adattare a sé.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       │   └── 3.3. Tra le persone, con gli strumenti.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│       └── 4. Dal lineare al non lineare e all’imperfetto.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├── Bibliografia&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├── Il libro&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├── Gli autori&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;└── Copyright&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;&#x2F;details&gt;
&lt;p&gt;The &lt;code&gt;readium manifest&lt;&#x2F;code&gt; command prints a &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=kSOETphAd4U&amp;amp;t=100s&quot;&gt;unified representation of a publication&lt;&#x2F;a&gt;, in JSON format.&lt;&#x2F;p&gt;
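&lt;p&gt;Since the manifest is plain JSON, the same &lt;code&gt;toc&lt;&#x2F;code&gt; array can be searched as well as printed. Here is a minimal Python sketch, assuming only the &lt;code&gt;title&lt;&#x2F;code&gt; and &lt;code&gt;children&lt;&#x2F;code&gt; fields that the script above relies on; the function names and the sample data are made up for illustration:&lt;&#x2F;p&gt;

```python
import json

def flatten_toc(items, trail=()):
    """Yield one tuple of titles per TOC entry, from the top level down to the entry."""
    for item in items:
        path = trail + (item.get("title", ""),)
        yield path
        # Nested entries live under "children", as in the jq script.
        yield from flatten_toc(item.get("children", []), path)

def search_toc(manifest, query):
    """Return the entries whose own title contains the query, case-insensitively."""
    needle = query.lower()
    return [
        " / ".join(path)
        for path in flatten_toc(manifest.get("toc", []))
        if needle in path[-1].lower()
    ]

# Hypothetical sample; a real manifest would come from `readium manifest book.epub`.
SAMPLE = json.loads("""
{"toc": [
  {"title": "Introduzione"},
  {"title": "Parte prima. Una mappa", "children": [
    {"title": "IV. Silenzi", "children": [
      {"title": "2. Il silenzio del mittente."}
    ]}
  ]}
]}
""")

if __name__ == "__main__":
    for hit in search_toc(SAMPLE, "silenz"):
        print(hit)
```

&lt;p&gt;Each hit is printed with its full path of parent titles, so a match deep in the hierarchy still shows which part and chapter it belongs to.&lt;&#x2F;p&gt;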
&lt;p&gt;But this is just a hacky trick, nothing more.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;The serious discussion I&#x27;d like to engage in — though I&#x27;m not sure where or which community would be best for this — is whether online book catalogs, from both stores and public libraries, publish their books&#x27; tables of contents and whether those are searchable.&lt;&#x2F;strong&gt;
Is this a technical limitation, a licensing restriction from publishers, or simply an overlooked feature? Being able to search within tables of contents would significantly improve book discovery and research workflows.&lt;&#x2F;p&gt;
&lt;p&gt;Here are some examples I know so far:&lt;&#x2F;p&gt;
&lt;h3 id=&quot;digitocs-by-university-of-bologna&quot;&gt;DigiTocs by the University of Bologna&lt;&#x2F;h3&gt;
&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;sba.unibo.it&#x2F;it&#x2F;almadl&#x2F;servizi-almadl&#x2F;digitocs-arricchire-l-opac-con-indici-e-sommari&quot;&gt;DigiTocs&lt;&#x2F;a&gt; is a service launched by the University of Bologna in 2009 that provides online access to indexes, tables of contents, and supplementary pages from books cataloged in their library system.
The service works through a distributed network of participating university libraries, each responsible for digitizing and uploading pages along with OCR-generated text and metadata. The platform is integrated with the library&#x27;s OPAC (online catalog), allowing users to view and search digitized indexes and tables of contents directly from catalog records (&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;sol.unibo.it&#x2F;SebinaOpac&#x2F;resource&#x2F;storia-dellindice-il-vaticano-e-i-libri-proibiti&#x2F;UBO02296603?locale=eng&quot;&gt;example book&lt;&#x2F;a&gt; and its &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;digitocs.unibo.it&#x2F;orti.php?id=BID_2296603&quot;&gt;TOC&lt;&#x2F;a&gt;).&lt;&#x2F;p&gt;
&lt;h3 id=&quot;neural-archive&quot;&gt;Neural Archive&lt;&#x2F;h3&gt;
&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;archive.neural.it&quot;&gt;Neural Archive&lt;&#x2F;a&gt; is the online catalog of the library maintained by &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Neural_(magazine)&quot;&gt;Neural Magazine&lt;&#x2F;a&gt;. For each book they review, they publish high-quality cover images, minimal metadata, and the book&#x27;s TOC.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;out-of-context&quot;&gt;Out of context&lt;&#x2F;h1&gt;
&lt;p&gt;Read J.G. Ballard&#x27;s short story, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;web.archive.org&#x2F;web&#x2F;20080504012305&#x2F;http:&#x2F;&#x2F;www.ballardian.com&#x2F;indexed-out-of-existence&quot;&gt;The Index&lt;&#x2F;a&gt; (1977).&lt;&#x2F;p&gt;
&lt;hr &#x2F;&gt;
&lt;p&gt;This blog has no comments or webmentions, so let&#x27;s continue the discussion on the fediverse: I am &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;digipres.club&#x2F;@raffaele&quot;&gt;@raffaele@digipres.club&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>My linkblog on fediverse</title> <published>2024-12-23T00:00:00+00:00</published>
                <updated>2024-12-23T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/linkblog-fediverse/" />
                <link rel="alternate"
            href="https://literarymachin.es/linkblog-fediverse/" type="text/html" />
                <id>https://literarymachin.es/linkblog-fediverse/</id> <content
            type="html">&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Hyperlink&quot;&gt;Hyperlinks&lt;&#x2F;a&gt; are the essence of the web. They enable content discovery,
allowing users to navigate between diverse sources of information with different interfaces, graphics, and technologies.
Using links is straightforward - you just need to click or tap on them.
It&#x27;s also easy to create new links on the web: you just need to follow some basic rules and conventions.&lt;&#x2F;p&gt;
&lt;p&gt;Lately, there has been a renaissance of &lt;strong&gt;linkblogs&lt;&#x2F;strong&gt;, blogs focused on sharing curated links.
Some notable examples of linkblogs I follow: &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;simonwillison.net&#x2F;atom&#x2F;links&#x2F;&quot;&gt;Simon Willison&#x27;s Links&lt;&#x2F;a&gt;,
&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.somebits.com&#x2F;linkblog&#x2F;&quot;&gt;Nelson&#x27;s Linkblog&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;laughingmeme.org&#x2F;links&#x2F;&quot;&gt;Kellan&#x27;s Linkblog&lt;&#x2F;a&gt;.
&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.thingsmagazine.net&#x2F;&quot;&gt;Things Magazine&lt;&#x2F;a&gt; is also a linkblog, as is the wonderful Italian newsletter
&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;linkmoltobelli.substack.com&#x2F;&quot;&gt;Link Molto Belli == Very Beautiful Links&lt;&#x2F;a&gt;.
My RSS reader also follows many Pinboard accounts; they are technically personal bookmarks, but I consider them equivalent to linkblogs.&lt;&#x2F;p&gt;
&lt;p&gt;I&#x27;ve been thinking about creating my own linkblog for a while now.
I browse the web and save many bookmarks; some are private, but most can be public.
They reflect a curated filter of web content about topics related to my interests
(digital libraries and archives, web archiving, books, mountains, and obscure music).&lt;&#x2F;p&gt;
&lt;p&gt;I discarded the idea of using a blogging service because I prefer to self-host my content (like &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;literarymachin.es&quot;&gt;this blog&lt;&#x2F;a&gt;).
I could have used a static site generator - there are dozens of them - but I always struggle to find one as simple as I want.
Furthermore, since a linkblog is related to content I browse, I want something with less friction than creating a
markdown file, pushing to a repo, and waiting for the build.
I want an admin interface where I can post quickly, without leaving the browser, maybe with the help of a bookmarklet, a browser extension,
or a Tampermonkey script.&lt;&#x2F;p&gt;
&lt;p&gt;I have also evaluated Pocketbase, which is a very nice application platform. You can easily create a data model (with migrations), the UI is minimalistic and beautiful, and you can easily plug in code (this was my first attempt that just publishes an RSS feed, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;linkbase&quot;&gt;linkbase&lt;&#x2F;a&gt;). It&#x27;s very easy and powerful, but some things are missing: an HTML interface (which could be quickly done with &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;templ.guide&quot;&gt;templ&lt;&#x2F;a&gt;), but more importantly, Fediverse integration. Because yes, for a linkblog an RSS&#x2F;Atom feed is mandatory, but these days ActivityPub is also a good way to publish content and reach readers.&lt;&#x2F;p&gt;
&lt;p&gt;A full Mastodon instance is overkill, considering the resources and maintenance required. I want something simpler.
Here comes &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;codeberg.org&#x2F;grunfink&#x2F;snac2&quot;&gt;Snac&lt;&#x2F;a&gt;, a simple, minimalistic ActivityPub instance written in portable C. No database is needed: the data is stored as JSON files on the filesystem, dependencies are minimal, and there is no JavaScript.&lt;&#x2F;p&gt;
&lt;p&gt;I first heard of Snac from &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;mastodon.bsd.cafe&#x2F;@stefano&quot;&gt;Stefano Marinelli&lt;&#x2F;a&gt;, who is a lovely source of news from the BSD world, self-hosting, networking and everything related to the Unix philosophy. Then from Giacomo Tesio and his good post &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;encrypted.tesio.it&#x2F;2024&#x2F;12&#x2F;18&#x2F;how-to-run-your-own-social-network.html&quot;&gt;How to run your own social network (with Snac)&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;So, this is my linkblog on fediverse, made with Snac: &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;href.literarymachin.es&#x2F;raffaele&quot;&gt;https:&#x2F;&#x2F;href.literarymachin.es&#x2F;raffaele&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;This is how I installed it. I prefer a containerized deployment, but a static binary and a systemd service are enough, and maybe even simpler to deploy.&lt;&#x2F;p&gt;
&lt;p&gt;I build the image on my laptop:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;git clone https:&#x2F;&#x2F;codeberg.org&#x2F;grunfink&#x2F;snac2.git&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;cd snac2&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;docker build -t snac .&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Then I transfer the image to the remote server (it&#x27;s a 12MB image, I don&#x27;t need a registry!):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;docker image save snac | ssh {REMOTE_SERVER} docker load&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;I run it with this Docker Compose file:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;services:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    href:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        image: snac&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        restart: always&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        security_opt:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            - no-new-privileges:true&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        volumes:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            - .&#x2F;data:&#x2F;data&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        ports:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            - &amp;quot;8001:8001&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;mkdir data&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;docker compose up -d&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The basic configuration needed is to change the hostname in &lt;code&gt;server.json&lt;&#x2F;code&gt;:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;cat data&#x2F;data&#x2F;server.json | jq .host&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;quot;href.literarymachin.es&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Then I create my user:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;docker compose exec href snac adduser &#x2F;data&#x2F;data raffaele&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;And finally, I configured an nginx proxy like this &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;codeberg.org&#x2F;grunfink&#x2F;snac2&#x2F;src&#x2F;branch&#x2F;master&#x2F;examples&#x2F;nginx-alpine-ssl&#x2F;default.conf&quot;&gt;example&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;href.literarymachin.es&#x2F;raffaele&quot;&gt;Follow my linkblog&lt;&#x2F;a&gt;, and suggest more linkblogs to follow!&lt;&#x2F;p&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>The digital objects of the SBN catalogue</title> <published>2024-11-01T00:00:00+00:00</published>
                <updated>2024-11-01T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/sbn-digital/" />
                <link rel="alternate"
            href="https://literarymachin.es/sbn-digital/" type="text/html" />
                <id>https://literarymachin.es/sbn-digital/</id> <content
            type="html">&lt;p&gt;I recently discovered that the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.iccu.sbn.it&#x2F;it&#x2F;attivita-servizi&#x2F;dati-aperti&#x2F;&quot;&gt;SBN catalogue APIs&lt;&#x2F;a&gt; are available, although I don&#x27;t know how long ago they were released. It is a topic I have followed in the past, more out of personal curiosity than professional need, because I strongly believe in the value of open data and metadata in the cultural heritage sector. Years ago I had found some &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;sbn-json-api&#x2F;&quot;&gt;unofficial APIs&lt;&#x2F;a&gt; used by the catalogue&#x27;s mobile apps, which still work today, albeit with limited functionality. Those APIs still attract interest from researchers and developers who contact me for further details, which unfortunately I am unable to provide.&lt;&#x2F;p&gt;
&lt;p&gt;What follows is my analysis of these new official SBN catalogue APIs, and how I used them for a specific case study: &lt;strong&gt;obtaining the list of documents for which a digital resource is available&lt;&#x2F;strong&gt;.
The whole SBN catalogue holds &lt;strong&gt;20+ million documents&lt;&#x2F;strong&gt;. The subset I am interested in, the digitized documents, is just &lt;strong&gt;under one million (938,000+)&lt;&#x2F;strong&gt;.
Getting the list of documents with an available digital object seemed like a good experiment to explore the catalogue at random and discover some noteworthy content (serendipity!).&lt;&#x2F;p&gt;
&lt;p&gt;I came across some peculiarities in the data modelling, and the lack of detailed, complete documentation forced me to proceed by trial and error and guesswork. I don&#x27;t mean to criticize or belittle the work done by ICCU; on the contrary, I think it is an important achievement, and I hope that more public discussion of these tools and data interfaces can help improve them and encourage their use.&lt;&#x2F;p&gt;
&lt;p&gt;I should point out, however, that I have lately developed a different, less orthodox view of how cultural heritage data should be distributed. I wrote about it in &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;beyond-api-data-dumps&#x2F;&quot;&gt;Beyond HTTP APIs: the case for database dumps in Cultural Heritage&lt;&#x2F;a&gt;, arguing that we should prefer complete, self-contained, ready-to-use exports over APIs.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;quickstart-per-usare-le-api&quot;&gt;Quickstart for using the APIs&lt;&#x2F;h1&gt;
&lt;p&gt;The APIs are available from this portal: &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;api.iccu.sbn.it&#x2F;devportal&#x2F;apis&quot;&gt;https:&#x2F;&#x2F;api.iccu.sbn.it&#x2F;devportal&#x2F;apis&lt;&#x2F;a&gt;. Access is not public or anonymous: to use them you need to &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;api.iccu.sbn.it&#x2F;accountrecoveryendpoint&#x2F;register.do&quot;&gt;register an account&lt;&#x2F;a&gt; and then create OAuth2 keys, which are used to generate a token to include in every call.&lt;&#x2F;p&gt;
&lt;p&gt;The software used here is &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;wso2.com&#x2F;api-manager&#x2F;&quot;&gt;WSO2 API manager&lt;&#x2F;a&gt; and, as far as I could tell, it directly exposes Solr APIs (read-only, of course).
There are several APIs, split by service, presented graphically as a sort of periodic table. It is not immediately clear what they refer to, and the terminology is aimed at people who already know the ecosystem of SBN services. I have no idea what CA (Cataloghi Storici) or IC (ICFE Services) are, and I guessed that AB refers to the Anagrafe Biblioteche (the library registry).
But what I am interested in is &lt;strong&gt;SB&lt;&#x2F;strong&gt;, &lt;strong&gt;SBN Integrato&lt;&#x2F;strong&gt;.&lt;&#x2F;p&gt;
&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;iccuapi.jpg&quot; alt=&quot;ICCU API Portal&quot;&gt;
&lt;p&gt;Each API naturally has its own kinds of calls and responses. Ready-made SDKs are provided for Java and JavaScript. For my work I preferred to start writing a library in Go: you can find it here &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iccu&quot;&gt;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iccu&lt;&#x2F;a&gt;. It is not a complete SDK; it is still a very bare-bones module, and I may complete it over time.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;la-cattura-dei-documenti-con-oggetto-digitale&quot;&gt;Harvesting the documents with a digital object&lt;&#x2F;h1&gt;
&lt;p&gt;To explore the catalogue I immediately gave up on the idea of interpreting the API responses in real time: I decided to save all the data locally and parse it later. I saved the responses in an extremely simple &lt;strong&gt;SQLite&lt;&#x2F;strong&gt; database: a &lt;code&gt;doc&lt;&#x2F;code&gt; field of type json, where I store the raw JSON returned by the API, and a &lt;code&gt;bid&lt;&#x2F;code&gt; column automatically populated from the UNIMARC field 003:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;CREATE TABLE&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; sbn&lt;&#x2F;span&gt;&lt;span&gt; (&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        bid &lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;TEXT&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; GENERATED ALWAYS AS&lt;&#x2F;span&gt;&lt;span&gt; (json_extract(doc, &lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;#39;$.unimarc.fields[1].003&amp;#39;&lt;&#x2F;span&gt;&lt;span&gt;)) VIRTUAL,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        doc &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;json&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;CREATE INDEX&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; bid_idx&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; on&lt;&#x2F;span&gt;&lt;span&gt; sbn(bid);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
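&lt;p&gt;To see the generated column at work, here is a minimal Python sketch that creates the same table in an in-memory database and inserts one record. The record shape is a hypothetical reduction inferred from the JSON path in the schema, not an actual API response, and it assumes a SQLite build with generated columns and JSON functions (3.31 or later).&lt;&#x2F;p&gt;

```python
import json
import sqlite3

# In-memory database, so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sbn (
    bid TEXT GENERATED ALWAYS AS (json_extract(doc, '$.unimarc.fields[1].003')) VIRTUAL,
    doc json
);
CREATE INDEX bid_idx ON sbn(bid);
""")

# Hypothetical, heavily reduced record: only the part of the structure
# that the JSON path above relies on (the second field carries tag 003).
record = {
    "unimarc": {
        "fields": [
            {"001": "placeholder"},
            {"003": "http://id.sbn.it/bid/RAV0302299"},
        ]
    }
}

# Only doc is written; bid is computed by SQLite from the JSON.
conn.execute("INSERT INTO sbn (doc) VALUES (?)", (json.dumps(record),))
row = conn.execute("SELECT bid FROM sbn").fetchone()
print(row[0])
```

&lt;p&gt;Because &lt;code&gt;bid&lt;&#x2F;code&gt; is a virtual column, it takes no storage of its own, and the index still makes lookups by identifier fast.&lt;&#x2F;p&gt;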
&lt;p&gt;The API call is the following: the relevant arguments are &lt;code&gt;presenza_digitale=Y&lt;&#x2F;code&gt; and &lt;code&gt;detail=full&lt;&#x2F;code&gt; (otherwise you get a minimal object that lacks the full UNIMARC).&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;GET https:&#x2F;&#x2F;api.iccu.sbn.it&#x2F;sbn&#x2F;1.0.0&#x2F;search&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;	format=json&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;	detail=full&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;	page-size=500&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;	presenza_digitale=Y&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;This is a complete example of the response for a single record, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;gist.github.com&#x2F;atomotic&#x2F;ab0cdb3e9e81cd07f4e76bd87013e99c&quot;&gt;RAV0302299&lt;&#x2F;a&gt; (&lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;id.sbn.it&#x2F;bid&#x2F;RAV0302299&quot;&gt;http:&#x2F;&#x2F;id.sbn.it&#x2F;bid&#x2F;RAV0302299&lt;&#x2F;a&gt;).&lt;&#x2F;p&gt;
&lt;p&gt;I used a fairly large page size, 500 documents per response. Increasing the number of documents returned reduces the number of HTTP calls and can speed up harvesting a lot; the problem is that some documents often contain encoding errors, and the returned JSON is then invalid. When you hit one, you lose the content of every document in that pagination window. It happened in my analysis too, and I neither investigated further nor implemented more effective parsing: I lost a few thousand documents, which is an acceptable margin of error.&lt;&#x2F;p&gt;
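&lt;p&gt;One possible way to shrink that loss, which is not what the harvesting code described here does, is to split a failing window in half and retry until the broken record is isolated. The Python sketch below illustrates the idea under a strong assumption: that the API can be asked for an arbitrary window by offset and size, which has not been verified. &lt;code&gt;harvest_window&lt;&#x2F;code&gt; and &lt;code&gt;fake_fetch&lt;&#x2F;code&gt; are hypothetical names, and the fetch function simulates the server instead of calling it.&lt;&#x2F;p&gt;

```python
import json


def harvest_window(fetch, start, size):
    """Return (docs, skipped_offsets) for the window [start, start + size)."""
    try:
        return json.loads(fetch(start, size)), []
    except json.JSONDecodeError:
        if size == 1:
            # A single unparsable record: give up on it and remember where it was.
            return [], [start]
        # Split the window in two and retry each half on its own.
        half = size // 2
        left_docs, left_bad = harvest_window(fetch, start, half)
        right_docs, right_bad = harvest_window(fetch, start + half, size - half)
        return left_docs + right_docs, left_bad + right_bad


def fake_fetch(start, size):
    """Stand-in for the HTTP call: record 7 contains a raw control
    character, which makes the whole response invalid JSON."""
    items = []
    for i in range(start, start + size):
        title = "bad \x01 title" if i == 7 else "title %d" % i
        items.append('{"n": %d, "title": "%s"}' % (i, title))
    return "[" + ", ".join(items) + "]"


docs, skipped = harvest_window(fake_fetch, 0, 16)
print(len(docs), skipped)
```

&lt;p&gt;The price is a handful of extra calls for each broken window, in exchange for losing one record instead of the whole page.&lt;&#x2F;p&gt;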
&lt;p&gt;The result is a database of &lt;strong&gt;936500 rows&lt;&#x2F;strong&gt;, weighing &lt;strong&gt;4.7G&lt;&#x2F;strong&gt;. I won&#x27;t distribute this database publicly (the licence covering these data is not clear to me), but if anyone is interested I am happy to share it.&lt;&#x2F;p&gt;
&lt;p&gt;As with scraping, some rules of good conduct still apply when using an API: limit the aggressiveness and rate of the calls, and always identify yourself in the User-Agent of the HTTP calls (although these APIs use a token, so I assume the origin of every activity is always traceable).&lt;&#x2F;p&gt;
&lt;p&gt;The code used for harvesting is available here: &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iccu&#x2F;tree&#x2F;main&#x2F;cmd&#x2F;sbn-metadata-fetch&quot;&gt;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iccu&#x2F;cmd&#x2F;sbn-metadata-fetch&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
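&lt;p&gt;As a rough illustration of those two rules, and not the actual harvesting code, the Python sketch below builds a request that identifies itself and spaces calls out in time. The query parameters are the ones shown earlier; the &lt;code&gt;page&lt;&#x2F;code&gt; parameter name, the Bearer authorization header, the User-Agent string and the &lt;code&gt;Throttle&lt;&#x2F;code&gt; helper are all assumptions made for the example. No request is actually sent.&lt;&#x2F;p&gt;

```python
import time
import urllib.parse
import urllib.request

BASE_URL = "https://api.iccu.sbn.it/sbn/1.0.0/search"


def build_request(token, page):
    """Build (but do not send) one search request that identifies itself."""
    params = {
        "format": "json",
        "detail": "full",
        "page-size": 500,
        "presenza_digitale": "Y",
        "page": page,  # hypothetical name for the pagination parameter
    }
    url = BASE_URL + "?" + urllib.parse.urlencode(params)
    headers = {
        # Assumption: the OAuth2 token is passed as a Bearer token.
        "Authorization": "Bearer " + token,
        # Say who you are and how to reach you.
        "User-Agent": "sbn-harvest-sketch/0.1 (contact: you@example.org)",
    }
    return urllib.request.Request(url, headers=headers)


class Throttle:
    """Enforce a minimum interval, in seconds, between consecutive calls."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.last = None

    def wait(self):
        now = time.monotonic()
        if self.last is not None:
            # Sleep only for the part of the interval not yet elapsed.
            time.sleep(max(0.0, self.min_interval - (now - self.last)))
        self.last = time.monotonic()


req = build_request("TOKEN", 1)
print(req.full_url)
print(req.get_header("User-agent"))
```

&lt;p&gt;In a harvesting loop you would call &lt;code&gt;wait()&lt;&#x2F;code&gt; on a shared &lt;code&gt;Throttle&lt;&#x2F;code&gt; before opening each request, for example with an interval of one second.&lt;&#x2F;p&gt;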
&lt;h1 id=&quot;l-analisi-e-l-esplorazione-dei-metadati&quot;&gt;Analysing and exploring the metadata&lt;&#x2F;h1&gt;
&lt;p&gt;I naively thought that &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.sqlite.org&#x2F;json1.html&quot;&gt;SQL queries on the JSON field of the SQLite database&lt;&#x2F;a&gt; would be enough to explore these data. Unfortunately, the lack of a schema and the way some data are modelled make it hard to do everything in SQL, and I had to write methods on the Go object that implement some logic on these data.&lt;&#x2F;p&gt;
&lt;p&gt;I am not interested in ALL the available metadata, only in a small subset: what I need are the links to the digitized objects rather than the metadata.
By transforming the source metadata I wanted to obtain simplified objects like the following (data such as authors are deliberately left out).&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;json&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;{&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;bid&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;IT&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;\\&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;ICCU&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;\\&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;VIAE&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;\\&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;007373&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;id&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;http:&#x2F;&#x2F;id.sbn.it&#x2F;bid&#x2F;VIAE007373&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;idmanus&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;title&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Risposta apologetica, e critica alle osservazioni, ed alla lettera del molto reverendo padre Cantova della Compagnia di Gesu, stampate in Milano l&amp;#39;anno 1752. Contro a chi ha ultimamente difesa la necessita dell&amp;#39;amor di Dio nel sagramento della penitenza&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;iiif&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;    &amp;quot;https:&#x2F;&#x2F;jmms.iccu.sbn.it&#x2F;jmms&#x2F;metadata&#x2F;UW01alpnX18_&#x2F;b2FpOmJuY2YuZmlyZW56ZS5zYm4uaXQ6MjE6RkkwMDk4Ok1hZ2xpYWJlY2hpOlZJQUUwMDczNzM_&#x2F;manifest.json&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  ],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;link&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;    &amp;quot;http:&#x2F;&#x2F;books.google.com&#x2F;books?vid=IBSC:SC000005684&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;    &amp;quot;http:&#x2F;&#x2F;books.google.com&#x2F;books?vid=IBSC:SC000008356&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;    &amp;quot;http:&#x2F;&#x2F;teca.bncf.firenze.sbn.it&#x2F;ImageViewer&#x2F;servlet&#x2F;ImageViewer?idr=BNCF0003334533&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  ],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;type&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Testo&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;material&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;    &amp;quot;Libro antico&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  ],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;thumbnails&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;    &amp;quot;https:&#x2F;&#x2F;jmms.iccu.sbn.it&#x2F;jmms&#x2F;resource&#x2F;ad&#x2F;first&#x2F;UW01alpnX18_&#x2F;b2FpOmJuY2YuZmlyZW56ZS5zYm4uaXQ6MjE6RkkwMDk4Ok1hZ2xpYWJlY2hpOlZJQUUwMDczNzM_&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  ],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;start_date&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 1753&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;end_date&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 1753&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The script &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iccu&#x2F;tree&#x2F;main&#x2F;cmd&#x2F;sbn-metadata-transform&quot;&gt;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iccu&#x2F;cmd&#x2F;sbn-metadata-transform&lt;&#x2F;a&gt; extracts the data from the SQLite db and generates a file in &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;jsonlines.org&#x2F;&quot;&gt;JSON Lines&lt;&#x2F;a&gt; format (~500M). This export is then ready to be loaded into other tools better suited to data analysis, such as Solr or DuckDB.&lt;&#x2F;p&gt;
&lt;p&gt;I chose &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;duckdb.org&#x2F;&quot;&gt;DuckDB&lt;&#x2F;a&gt;, and this is how I loaded the data:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ duckdb sbn.duckdb &amp;quot;create table digital as select * from read_json_auto(&amp;#39;sbn.jsonl&amp;#39;);&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ duckdb sbn.duckdb&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;D .schema&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;CREATE TABLE digital(bid VARCHAR, id VARCHAR, idmanus VARCHAR, title VARCHAR, iiif VARCHAR[], link VARCHAR[], &amp;quot;type&amp;quot; VARCHAR, material VARCHAR[], thumbnails VARCHAR[], start_date BIGINT, end_date BIGINT);&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;I exported the DuckDB database to Parquet format, and it can be downloaded from here: &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;atomotic.github.io&#x2F;data&#x2F;sbn.digital.parquet&quot;&gt;https:&#x2F;&#x2F;atomotic.github.io&#x2F;data&#x2F;sbn.digital.parquet&lt;&#x2F;a&gt; (93M).&lt;&#x2F;p&gt;
&lt;p&gt;The Parquet file can be used directly in the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;shell.duckdb.org&quot;&gt;DuckDB shell&lt;&#x2F;a&gt; in the browser, without installing anything. You only need to create a table (&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;shell.duckdb.org&#x2F;#queries=v0,CREATE-TABLE-digital-AS-FROM-&amp;#x27;https%3A%2F%2Fatomotic.github.io%2Fdata%2Fsbn.digital.parquet&amp;#x27;~,select-*-from-digital-limit-5~&quot;&gt;example&lt;&#x2F;a&gt;):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;CREATE TABLE digital AS FROM &amp;#39;https:&#x2F;&#x2F;atomotic.github.io&#x2F;data&#x2F;sbn.digital.parquet&amp;#39;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h1 id=&quot;alcune-query-dimostrative&quot;&gt;Some demonstration queries:&lt;&#x2F;h1&gt;
&lt;h3 id=&quot;numero-di-documenti-raggruppati-per-tipologia&quot;&gt;Number of documents grouped by type&lt;&#x2F;h3&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;D &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;SELECT&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;    type&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt;    COUNT&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;*&lt;&#x2F;span&gt;&lt;span&gt;) &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;AS&lt;&#x2F;span&gt;&lt;span&gt; count&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;  FROM&lt;&#x2F;span&gt;&lt;span&gt; digital&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;  GROUP BY type order by&lt;&#x2F;span&gt;&lt;span&gt; count &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;desc&lt;&#x2F;span&gt;&lt;span&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;┌───────────────────────────────────┬────────┐&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│               &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;type&lt;&#x2F;span&gt;&lt;span&gt;                │ count  │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│              &lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;varchar&lt;&#x2F;span&gt;&lt;span&gt;              │ int64  │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├───────────────────────────────────┼────────┤&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Testo                             │ &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;506962&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Registrazione sonora musicale     │ &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;310053&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Risorsa grafica                   │  &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;53829&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Musica manoscritta                │  &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;20721&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Testo manoscritto                 │  &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;19221&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Musica a stampa                   │  &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;11565&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Registrazione sonora non musicale │   &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;7180&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Risorsa cartografica a stampa     │   &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;4483&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Risorsa elettronica               │   &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;1965&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Risorsa cartografica manoscritta  │    &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;406&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Risorsa da proiettare o video     │     &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;72&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Oggetto tridimensionale           │     &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;29&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ Risorsa multimediale              │     &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;14&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├───────────────────────────────────┴────────┤&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;13&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; rows&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;                          2&lt;&#x2F;span&gt;&lt;span&gt; columns │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;└────────────────────────────────────────────┘&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h3 id=&quot;numero-di-manifest-iiif&quot;&gt;Number of IIIF manifests&lt;&#x2F;h3&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;D &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;SELECT&lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt; COUNT&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;*&lt;&#x2F;span&gt;&lt;span&gt;) &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;as&lt;&#x2F;span&gt;&lt;span&gt; manifest&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;    FROM&lt;&#x2F;span&gt;&lt;span&gt; (&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;        SELECT DISTINCT&lt;&#x2F;span&gt;&lt;span&gt; unnest(iiif)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;        FROM&lt;&#x2F;span&gt;&lt;span&gt; digital&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    );&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;┌──────────┐&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ manifest │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│  int64   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├──────────┤&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│   &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;341324&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;└──────────┘&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h3 id=&quot;numero-di-links-esterni&quot;&gt;Number of external links&lt;&#x2F;h3&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;D &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;SELECT&lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt; COUNT&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;*&lt;&#x2F;span&gt;&lt;span&gt;) &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;as&lt;&#x2F;span&gt;&lt;span&gt; link&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;    FROM&lt;&#x2F;span&gt;&lt;span&gt; (&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;        SELECT DISTINCT&lt;&#x2F;span&gt;&lt;span&gt; unnest(link)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;        FROM&lt;&#x2F;span&gt;&lt;span&gt; digital&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    );&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;┌─────────┐&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│  link   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│  int64  │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;├─────────┤&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;│ &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;1045225&lt;&#x2F;span&gt;&lt;span&gt; │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;└─────────┘&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h3 id=&quot;origine-dei-link-esterni&quot;&gt;Origin of the external links&lt;&#x2F;h3&gt;
&lt;p&gt;For the external links, I extracted the server host from each URL and grouped the results, in order to &lt;strong&gt;identify where they come from&lt;&#x2F;strong&gt;. I used &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;curl&#x2F;trurl&quot;&gt;trurl&lt;&#x2F;a&gt; to parse the URLs; it also reported several parsing errors, which I ignored as marginal:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; duckdb --list sbn.duckdb &lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;SELECT DISTINCT TRIM(unnest(link)) AS unique_links FROM digital;&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; \&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;    |&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; trurl&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -f&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; -&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; --get&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;{host}&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; --accept-space&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; urls.txt&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
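&lt;p&gt;For readers without trurl at hand, the same host extraction can be sketched with the Python standard library. This is a stand-in, not the pipeline used above: &lt;code&gt;urllib.parse&lt;&#x2F;code&gt; and trurl do not reject exactly the same inputs, so the resulting counts may differ slightly. The function name &lt;code&gt;extract_host&lt;&#x2F;code&gt; and the sample links are made up for illustration.&lt;&#x2F;p&gt;

```python
from urllib.parse import urlsplit


def extract_host(url):
    """Return the lowercased host of a URL, or None when it cannot be parsed."""
    try:
        # strip() mirrors the TRIM() applied in the DuckDB query
        host = urlsplit(url.strip()).hostname
    except ValueError:
        # malformed input (for example an unbalanced IPv6 bracket) is skipped
        return None
    return host or None


# hypothetical sample links, only to show the behaviour
sample = [
    " https://books.google.com/books?id=abc ",
    "http://teca.bncf.firenze.sbn.it/some/path",
    "not a url",
    "http://[broken",
]
hosts = [h for h in map(extract_host, sample) if h is not None]
print(hosts)
```

&lt;p&gt;Unparsable values are dropped silently here, which matches the decision to treat the parsing errors as marginal.&lt;&#x2F;p&gt;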
&lt;p&gt;The file urls.txt contains the unsorted list of hosts. &lt;code&gt;sort&lt;&#x2F;code&gt;, &lt;code&gt;uniq&lt;&#x2F;code&gt; and &lt;code&gt;wc&lt;&#x2F;code&gt; would be enough to do the counting, but there is &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;timbray&#x2F;topfew&quot;&gt;topfew&lt;&#x2F;a&gt; (by the well-known Tim Bray!), which is much more efficient.
Google Books, the Istituto Centrale dei Beni Sonori, and the BNCF Teca are the predominant sources.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ topfew -n 30 urls.txt&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;363190 books.google.com&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;312041 opac2.icbsa.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;134072 teca.bncf.firenze.sbn.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;58043 www.internetculturale.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;46714 books.google.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;12614 www.braidense.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;8558 www.bibliotecamusica.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;6290 www.widejef.com&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;6091 www.bdl.servizirl.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;5020 archive.org&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;4284 www.14-18.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;4276 corago.unibo.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;3772 www.google.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;3574 sbn.comune.eboli.sa.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;3562 www.cmarchiviodigitale.com&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;3177 digiteca.bsmc.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;3103 www.polodigitalenapoli.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;2602 www.aggiornamentisociali.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;2330 hdl.handle.net&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;2304 www.proquest.com&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;2280 atena.beic.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;1879 www.fondazionecircoloartistico.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;1698 badigit.comune.bologna.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;1546 doi.org&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;1431 digital.fondazionecarisbo.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;1431 5.175.50.107&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;1311 www.omeka.unito.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;1274 www.byterfly.eu&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;1196 www.repubblicaromana-1849.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;1164 turismo.comune.sanginesio.mc.it&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
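&lt;p&gt;The counting step itself is a plain frequency table. A minimal Python sketch of the same idea follows; it is not how topfew is implemented, and the function name &lt;code&gt;top_hosts&lt;&#x2F;code&gt; and the sample lines are assumptions made for the example.&lt;&#x2F;p&gt;

```python
from collections import Counter


def top_hosts(lines, n=30):
    """Count non-empty lines and return the n most frequent as (host, count)."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return counts.most_common(n)


# hypothetical in-memory stand-in for urls.txt
lines = ["books.google.com\n", "archive.org\n", "books.google.com\n", "\n"]
for host, count in top_hosts(lines, n=2):
    print(count, host)
```

&lt;p&gt;With the real file, an open file handle can be passed instead of the list, since the function only iterates over lines.&lt;&#x2F;p&gt;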
&lt;p&gt;Among the hosts there are some oddities: many bare IP addresses, and also a number of files linked from Google Drive (linking objects in a file storage service from a catalogue strikes me as a very bad idea).&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ grep drive.google urls.txt | wc -l&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;467&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Even worse, there are also several links to Facebook. At the same time, I am surprised that there are no links to Wikisource or Wikimedia Commons (though I plan to investigate further).&lt;&#x2F;p&gt;
&lt;h1 id=&quot;criticita-incontrate&quot;&gt;Issues encountered&lt;&#x2F;h1&gt;
&lt;p&gt;The problems I ran into are not technical issues with the API; they concern how the metadata is modelled:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;The structure is not uniform. There is a &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;gist.github.com&#x2F;atomotic&#x2F;ab0cdb3e9e81cd07f4e76bd87013e99c#file-rav0302299-json-L42-L1365&quot;&gt;unimarc object&lt;&#x2F;a&gt;, which is a JSON representation of the UNIMARC XML (not the most convenient thing to parse, but fine), while a number of accessory fields live outside that object (for example the IIIF manifests), and other data duplicates information already contained in the unimarc object. I suspect they are there to make access easier. I also think it is normal for a long-lived database such as SBN to end up adding accessory fields as the need arises.&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;Some values are incomplete: for example, the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;gist.github.com&#x2F;atomotic&#x2F;ab0cdb3e9e81cd07f4e76bd87013e99c#file-rav0302299-json-L10-L18&quot;&gt;IIIF manifests carry only the path, and the host is always missing&lt;&#x2F;a&gt;. With a few heuristics I managed to work it out, but values should always be complete. In other cases I noticed that some fields hold multiple values joined by a separator character: this is the case for external links, which are sometimes separated by &lt;code&gt;&quot; | &quot;&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;Location of the digital object. As far as I understand, there are two kinds: IIIF manifests, which are also displayed with a viewer directly in the web catalogue, and links to external pages (a record can have both manifests and links).
Manifests are exposed through &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;gist.github.com&#x2F;atomotic&#x2F;ab0cdb3e9e81cd07f4e76bd87013e99c#file-rav0302299-json-L10-L18&quot;&gt;fields at the top level of the object&lt;&#x2F;a&gt;: there are &lt;strong&gt;dig_cover&lt;&#x2F;strong&gt;, &lt;strong&gt;dig_manifest&lt;&#x2F;strong&gt;, &lt;strong&gt;dig_preview&lt;&#x2F;strong&gt; and &lt;strong&gt;dig_preview_URL&lt;&#x2F;strong&gt;, and the reason for the redundancy is not always clear to me. External links, on the other hand, are stored in the &lt;code&gt;unimarc&lt;&#x2F;code&gt; object, in &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;gist.github.com&#x2F;atomotic&#x2F;ab0cdb3e9e81cd07f4e76bd87013e99c#file-rav0302299-json-L663-L686&quot;&gt;899.u&lt;&#x2F;a&gt; or other fields.&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;Some vocabularies use single letters (for example in the document type and material type fields). These vocabularies are poorly documented; in such cases it would be better to use a (resolvable!) URI that leads to a documentation page.
Example:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;One-character document type code: a=Text b=Manuscript text c=Printed music d=Manuscript music e=Printed cartographic resource f=Manuscript cartographic resource g=Projected or video resource i=Non-musical sound recording j=Musical sound recording k=Graphic resource l=Electronic resource r=Three-dimensional object m=Multimedia resource&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;---&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;One-character material type code: v=Audiovisual c=Cartography g=Graphics A=Early printed book N=Modern book M=Music&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;There is no schema&lt;&#x2F;strong&gt;: this is the biggest problem. I had to proceed by trial and error and heuristics to parse those responses, and I am sure I have not covered every possible case or source of errors. Metadata needs schemas, which make validation and constraints possible. Several technologies are available, of varying complexity: &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;json-schema.org&#x2F;&quot;&gt;JSON Schema&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;avro.apache.org&#x2F;&quot;&gt;Avro&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;protobuf.dev&#x2F;&quot;&gt;Protobuf&lt;&#x2F;a&gt;. I think a good JSON Schema is enough to start with. There are also some newer tools such as &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;pkl-lang.org&#x2F;index.html&quot;&gt;PKL&lt;&#x2F;a&gt; or &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;cuelang.org&#x2F;&quot;&gt;CUE&lt;&#x2F;a&gt;, so far never used for metadata serialization, which I find interesting and which the digital libraries world could start evaluating.&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
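&lt;p&gt;To make the second and fourth points concrete, here is a rough Python sketch of the kind of clean-up a client currently has to do on its own. Everything except the &lt;code&gt;dig_manifest&lt;&#x2F;code&gt; field name and the separator is an assumption: the base host is a placeholder (the real one has to be guessed), the record shape and the &lt;code&gt;type&lt;&#x2F;code&gt; key are hypothetical, and the type labels are English renderings of a few of the codes listed above.&lt;&#x2F;p&gt;

```python
from urllib.parse import urljoin

# Assumption: placeholder host, the real one has to be found by heuristics
MANIFEST_BASE = "https://iiif.example.org/"

# A few of the single-letter document type codes, labels rendered in English
DOCUMENT_TYPES = {"a": "Text", "b": "Manuscript text", "c": "Printed music"}


def split_links(values):
    """Split link values that pack several URLs separated by '|'."""
    links = []
    for value in values:
        links.extend(part.strip() for part in value.split("|") if part.strip())
    return links


def complete_manifest(path, base=MANIFEST_BASE):
    """Turn a host-less manifest path into an absolute URL."""
    return urljoin(base, path.lstrip("/"))


# hypothetical record, shaped only for this example
record = {
    "dig_manifest": "/manifests/example.json",
    "link": ["http://a.example/1 | http://b.example/2"],
    "type": "a",
}
print(complete_manifest(record["dig_manifest"]))
print(split_links(record["link"]))
print(DOCUMENT_TYPES.get(record["type"], "unknown"))
```

&lt;p&gt;A published schema would make most of this guesswork unnecessary, because the separator, the base URL and the code lists would be stated once instead of being rediscovered by every client.&lt;&#x2F;p&gt;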
&lt;h1 id=&quot;conclusioni&quot;&gt;Conclusions&lt;&#x2F;h1&gt;
&lt;p&gt;Data modelling problems aside, the technical infrastructure behind these APIs seems to work very well. I would like to know whether usage statistics exist, or real examples of integration in external catalogues or portals.
I also think that the Wikidata world, where several integrations with the SBN catalogue already exist, could benefit from these APIs and make a number of existing processes faster and more automated.&lt;&#x2F;p&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>Beyond HTTP APIs: the case for database dumps in Cultural Heritage</title> <published>2024-07-15T00:00:00+00:00</published>
                <updated>2024-07-15T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/beyond-api-data-dumps/" />
                <link rel="alternate"
            href="https://literarymachin.es/beyond-api-data-dumps/" type="text/html" />
                <id>https://literarymachin.es/beyond-api-data-dumps/</id> <summary type="html">&lt;p&gt;In the realm of cultural heritage, we&#x27;re not just developing websites; we&#x27;re creating data platforms. One of the primary missions of cultural institutions is to make data (both metadata and digital content) freely available on the web. This data should come with appropriate usage licenses and in suitable formats to facilitate interoperability and content sharing.&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;In the realm of cultural heritage, we&#x27;re not just developing websites; we&#x27;re creating data platforms. One of the primary missions of cultural institutions is to make data (both metadata and digital content) freely available on the web. This data should come with appropriate usage licenses and in suitable formats to facilitate interoperability and content sharing.&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;However, there&#x27;s no universal approach to this task, and it&#x27;s far from simple. We face several challenges:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Licensing issues&lt;&#x2F;strong&gt;: Licenses are often missing, incomplete, or inadequate (I won&#x27;t discuss licenses in this post).&lt;&#x2F;li&gt;
&lt;li&gt;&lt;strong&gt;Data format standardization&lt;&#x2F;strong&gt;: We continue to struggle with finding universally accepted standards, formats, and protocols.&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;The current landscape of data distribution methods is varied:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Data dumps in various formats (CSV, XML, JSON, XLS 💀, TXT)&lt;&#x2F;li&gt;
&lt;li&gt;HTTP APIs (XML, JSON, RESTful, GraphQL, etc.)&lt;&#x2F;li&gt;
&lt;li&gt;Data packaging solutions (&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;BagIt&quot;&gt;BagIt&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;ocfl.io&#x2F;&quot;&gt;OCFL&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;specs.frictionlessdata.io&#x2F;&quot;&gt;Data Package&lt;&#x2F;a&gt;)&lt;&#x2F;li&gt;
&lt;li&gt;Linked Open Data (LOD)&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;While we often advocate for the adoption of Linked Open Data, which theoretically should be the ultimate solution, my experience suggests that the effort required to maintain, understand, and use LOD often renders it impractical and &lt;strong&gt;unusable&lt;&#x2F;strong&gt; (Wikidata is an exception).&lt;&#x2F;p&gt;
&lt;p align=&quot;center&quot;&gt;
  &lt;a href=&quot;https:&#x2F;&#x2F;x.com&#x2F;azaroth42&#x2F;status&#x2F;804132088450465792&quot;&gt;
    &lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;robsanderson.jpg&quot; alt=&quot;Tweet by Rob Sanderson about LOD&quot;&gt;
  &lt;&#x2F;a&gt;
&lt;&#x2F;p&gt;
&lt;h2 id=&quot;seeking-simpler-solutions&quot;&gt;Seeking Simpler Solutions&lt;&#x2F;h2&gt;
&lt;p&gt;Can we approach data distribution with simpler solutions? As always, it depends: if there is a need for real-time access to authoritative data, HTTP APIs are necessary (or streaming data solutions like Kafka and similar technologies).
&lt;strong&gt;But if we can work with a copy of the data, not necessarily updated in real time, then using self-contained and self-describing formats is a viable solution.&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;p&gt;I&#x27;ve always thought that the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.isfdb.org&quot;&gt;Internet Speculative Fiction Database&lt;&#x2F;a&gt; (ISFDB) publishing of &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.isfdb.org&#x2F;wiki&#x2F;index.php&#x2F;ISFDB_Downloads#Database_Backups&quot;&gt;MySQL dumps&lt;&#x2F;a&gt; was a smart move. Here&#x27;s why:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;An SQL dump includes the schema definitions (the &lt;code&gt;CREATE TABLE&lt;&#x2F;code&gt; statements) along with the data content.&lt;&#x2F;li&gt;
&lt;li&gt;While not directly usable, loading it into a working MySQL server is straightforward. You can easily set one up on your laptop.&lt;&#x2F;li&gt;
&lt;li&gt;Once loaded, you can work with the data offline, eliminating the need for thousands of HTTP calls (and the associated challenges with authentication, caching, etc.)&lt;&#x2F;li&gt;
&lt;li&gt;SQL is an extremely powerful language, and one of the first things a programmer should learn.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
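&lt;p&gt;A small sketch of why a dump is self-describing. To keep it runnable with the Python standard library alone, it swaps MySQL for an in-memory SQLite database, and the table and rows are made up (this is not the ISFDB schema).&lt;&#x2F;p&gt;

```python
import sqlite3

# A tiny, made-up dump: the CREATE TABLE statement travels with the rows
DUMP = """
CREATE TABLE titles (title_id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER);
INSERT INTO titles VALUES (1, 'Dune', 1965);
INSERT INTO titles VALUES (2, 'Solaris', 1961);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DUMP)  # load schema and data in one step

# the schema is recoverable from the database itself
schema = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'titles'"
).fetchone()[0]
print(schema)

# from here on, everything is a local query: no HTTP calls involved
rows = conn.execute("SELECT title FROM titles ORDER BY year").fetchall()
print(rows)
```

&lt;p&gt;A real MySQL dump uses MySQL-specific syntax and would be loaded with the MySQL client instead, but the principle is the same: column names, types and constraints arrive together with the data.&lt;&#x2F;p&gt;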
&lt;h3 id=&quot;the-rise-of-sqlite-duckdb-and-parquet&quot;&gt;The Rise of SQLite, DuckDB and Parquet&lt;&#x2F;h3&gt;
&lt;p&gt;In recent years, there has been a surge of interest in &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;sqlite.org&quot;&gt;SQLite&lt;&#x2F;a&gt;, an embedded database that has long been used in many applications. SQLite offers several advantages:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;It&#x27;s contained in a single file&lt;&#x2F;li&gt;
&lt;li&gt;It doesn&#x27;t require a server&lt;&#x2F;li&gt;
&lt;li&gt;It&#x27;s fast and efficient&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;While SQLite isn&#x27;t suitable for scenarios requiring network access or heavy concurrent writes, it&#x27;s perfect for distributing self-contained, ready-to-use data.&lt;&#x2F;p&gt;
&lt;p&gt;It&#x27;s worth noting the significant contributions of &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;simonwillison.net&#x2F;&quot;&gt;Simon Willison&lt;&#x2F;a&gt; in this area, particularly his suite of tools in the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;datasette.io&#x2F;&quot;&gt;Datasette&lt;&#x2F;a&gt; project. Other notable figures in this field include:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;alexgarcia.xyz&#x2F;&quot;&gt;Alex Garcia&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;x.com&#x2F;benbjohnson&quot;&gt;Ben Johnson&lt;&#x2F;a&gt; for Litestream&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;antonz.org&#x2F;all&#x2F;&quot;&gt;Anton Zhiyanov&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;An extremely powerful new tool, similar to SQLite, that recently reached its 1.0 milestone is &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;duckdb.org&#x2F;&quot;&gt;DuckDB&lt;&#x2F;a&gt;. This column-oriented database focuses on OLAP applications, making it less suitable for generic use in web applications. However, DuckDB has a standout feature: it can read and write &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Apache_Parquet&quot;&gt;Parquet&lt;&#x2F;a&gt; files, a highly efficient data storage format.&lt;&#x2F;p&gt;
&lt;p&gt;I was particularly inspired by &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;benschmidt.org&#x2F;&quot;&gt;Ben Schmidt&#x27;s&lt;&#x2F;a&gt; blog post, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;benschmidt.org&#x2F;post&#x2F;2022-03-19-better-texts&#x2F;&quot;&gt;Sharing texts better&lt;&#x2F;a&gt;. It introduced me to Parquet, a smart technology that can provide a robust foundation for data distribution, analysis, and manipulation in the cultural heritage sector.
DuckDB makes working with Parquet easy, but there are plenty of libraries (and ETL platforms) to wire Parquet data into any application.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;practical-examples&quot;&gt;Practical Examples&lt;&#x2F;h2&gt;
&lt;p&gt;Let&#x27;s explore some real-world applications of these technologies in the cultural heritage domain:&lt;&#x2F;p&gt;
&lt;h3 id=&quot;1-digipres-practice-index&quot;&gt;1. Digipres Practice Index&lt;&#x2F;h3&gt;
&lt;p&gt;The &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;www.digipres.org&#x2F;publications&#x2F;&quot;&gt;Digital Preservation Practice Index&lt;&#x2F;a&gt; is an &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;digipres&#x2F;digipres-practice-index&quot;&gt;experiment&lt;&#x2F;a&gt; by &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;anjackson.net&#x2F;&quot;&gt;Andy Jackson&lt;&#x2F;a&gt; that collects sources of information about digital preservation practices. It&#x27;s already distributed as a SQLite file and can be browsed using Datasette Lite, requiring only a web browser and no additional software installation.&lt;&#x2F;p&gt;
&lt;p&gt;In &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;digipres-index-parquet&quot;&gt;this repository&lt;&#x2F;a&gt;, I demonstrate the simple steps needed to convert the SQLite file to Parquet format using DuckDB. The &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;shell.duckdb.org&#x2F;#queries=v0,select-title-from-&amp;#x27;https%3A%2F%2Fatomotic.github.io%2Fdigipres%20index%20parquet%2Fdigipres_index.parquet&amp;#x27;-limit-10~,select-title%2Clanding_page_url-from-&amp;#x27;https%3A%2F%2Fatomotic.github.io%2Fdigipres%20index%20parquet%2Fdigipres_index.parquet&amp;#x27;-where-keywords-like-&amp;#x27;%25warc%25&amp;#x27;~&quot;&gt;resulting file&lt;&#x2F;a&gt; can be queried directly in the browser using DuckDB Shell (note: it reads &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=29040120&quot;&gt;Parquet with HTTP Range Requests&lt;&#x2F;a&gt;, so the file is partially downloaded over the network).&lt;&#x2F;p&gt;
&lt;h3 id=&quot;2-abi-anagrafe-biblioteche-italiane&quot;&gt;2. ABI Anagrafe Biblioteche Italiane&lt;&#x2F;h3&gt;
&lt;p&gt;Italian institutions publish a lot of open data, but everything is deeply fragmented: a lot of ontologies, many APIs (&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;api.iccu.sbn.it&#x2F;devportal&#x2F;apis&quot;&gt;official&lt;&#x2F;a&gt; and &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;sbn-json-api&#x2F;&quot;&gt;unofficial&lt;&#x2F;a&gt;), and many dumps with undocumented schemas.&lt;&#x2F;p&gt;
&lt;p&gt;For example, the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;anagrafe.iccu.sbn.it&#x2F;it&#x2F;&quot;&gt;Register of Italian Libraries&lt;&#x2F;a&gt; provides a &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;opendata.anagrafe.iccu.sbn.it&#x2F;opendata.zip&quot;&gt;zip file&lt;&#x2F;a&gt; containing some CSVs and XML files, and you need to struggle a bit to link pieces together.
In this repository &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;abi&quot;&gt;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;abi&lt;&#x2F;a&gt;, I created some scripts to transform all those sources into a Parquet file. XMLs are converted to JSON, which is still not the perfect approach, but a single file of 5MB contains all that information. And you can play with it online using the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;shell.duckdb.org&#x2F;#queries=v0,CREATE-TABLE-libraries-AS-FROM-&amp;#x27;https%3A%2F%2Fatomotic.github.io%2Fabi%2Fabi.parquet&amp;#x27;~,DESCRIBE-libraries~,SELECT-id%2C-name-FROM-libraries-WHERE-location.comune.nome%3D&amp;#x27;Bologna&amp;#x27;-LIMIT-10~&quot;&gt;DuckDB shell&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;3-bni-bibliografia-nazionale-italiana&quot;&gt;3. BNI Bibliografia Nazionale Italiana&lt;&#x2F;h3&gt;
&lt;p&gt;The &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;it.wikipedia.org&#x2F;wiki&#x2F;Bibliografia_nazionale_italiana&quot;&gt;Italian National Bibliography&lt;&#x2F;a&gt; (&lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;bni.bncf.firenze.sbn.it&#x2F;bniweb&#x2F;menu.jsp&quot;&gt;BNI&lt;&#x2F;a&gt;) is the official repertory of publications published in Italy and received by the National Central Library of Florence in accordance with legal deposit regulations.
The dumps consist of several XML files, each containing a list of records in Unimarc XML format.&lt;&#x2F;p&gt;
&lt;p&gt;In this repository &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;bni&quot;&gt;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;bni&lt;&#x2F;a&gt;, I have created some scripts to scrape the content and convert it to Parquet. The XML sources were ~275MB, while the resulting Parquet file is just 70MB.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;the-case-for-self-contained-data-formats&quot;&gt;The Case for Self-Contained Data Formats&lt;&#x2F;h2&gt;
&lt;p&gt;When APIs are not strictly necessary, distributing data in self-contained formats can significantly enhance usability and efficiency. The advantages:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Users can work offline and faster&lt;&#x2F;li&gt;
&lt;li&gt;Reduced server maintenance: with less reliance on real-time data serving, institutions can allocate fewer resources to maintaining complex API infrastructures.&lt;&#x2F;li&gt;
&lt;li&gt;Scalability: Self-contained data formats are inherently more scalable, as they don&#x27;t suffer from traffic spikes that can overwhelm API servers.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;In these crazy times of rapid advancement in Large Language Models (LLMs) and AI technologies, it&#x27;s increasingly likely that your content will be scraped by external entities (setting aside ethical or political considerations for now). By providing data dumps, you can mitigate potential issues for your infrastructure.&lt;&#x2F;p&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>Building a simple IIIF digital library with Tropy, Tropiiify and Canopy</title> <published>2024-06-05T00:00:00+00:00</published>
                <updated>2024-06-05T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/iiif-tropy-canopy/" />
                <link rel="alternate"
            href="https://literarymachin.es/iiif-tropy-canopy/" type="text/html" />
                <id>https://literarymachin.es/iiif-tropy-canopy/</id> <summary type="html">&lt;p&gt;Creating and maintaining an online digital collection can be a complex process involving multiple components, from organizational procedures to software solutions. With many moving parts, it&#x27;s no surprise that building and curating a digital collection can be costly, time-consuming, and demanding to maintain. When dealing with cultural heritage, maintenance and long-term preservation should be our primary concerns. The approach we should always consider is &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;go-dh.github.io&#x2F;mincomp&#x2F;thoughts&#x2F;2016&#x2F;10&#x2F;03&#x2F;tldr&#x2F;&quot;&gt;minimal computing&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;Creating and maintaining an online digital collection can be a complex process involving multiple components, from organizational procedures to software solutions. With many moving parts, it&#x27;s no surprise that building and curating a digital collection can be costly, time-consuming, and demanding to maintain. When dealing with cultural heritage, maintenance and long-term preservation should be our primary concerns. The approach we should always consider is &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;go-dh.github.io&#x2F;mincomp&#x2F;thoughts&#x2F;2016&#x2F;10&#x2F;03&#x2F;tldr&#x2F;&quot;&gt;minimal computing&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;In this tutorial, I&#x27;ll show you how to create and maintain a simple &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;iiif.io&quot;&gt;IIIF&lt;&#x2F;a&gt; collection using &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;tropy.org&quot;&gt;Tropy&lt;&#x2F;a&gt; and &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;canopy-iiif.github.io&#x2F;docs&quot;&gt;Canopy&lt;&#x2F;a&gt;, two powerful tools that can help you build static sites requiring zero maintenance.&lt;&#x2F;p&gt;
&lt;p&gt;There are many &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;IIIF&#x2F;awesome-iiif&quot;&gt;other libraries and applications&lt;&#x2F;a&gt;, including free software, that can achieve the same result. However, they often require at least some programming knowledge or the maintenance of server-side applications.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;tropy&quot;&gt;Tropy&lt;&#x2F;h2&gt;
&lt;p align=&quot;center&quot;&gt;&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;tropylogo.jpg&quot;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Tropy is a &lt;strong&gt;desktop application&lt;&#x2F;strong&gt; designed to organize and manage archival research photos, though it&#x27;s also great for managing almost any kind of image, including invoices or handwritten notes. It doesn&#x27;t require any online service; you can work offline on your desktop without needing to upload anything.&lt;&#x2F;p&gt;
&lt;p&gt;Although it&#x27;s yet another Electron application, the UI is very pleasant, minimal, and fast to use. You will quickly notice a significant improvement in your offline workflow compared to using online applications in a browser.&lt;&#x2F;p&gt;
&lt;p&gt;There&#x27;s an extensive &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;docs.tropy.org&#x2F;&quot;&gt;user guide to learn Tropy&lt;&#x2F;a&gt;, so I won&#x27;t cover all the details here. Instead, I want to highlight some features I consider important:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;tropy&#x2F;tropy&#x2F;blob&#x2F;main&#x2F;db&#x2F;schema&#x2F;project.sql&quot;&gt;A Tropy project is saved into an SQLite database&lt;&#x2F;a&gt;. This is a huge advantage because your data won&#x27;t be locked inside the application. If you have programming knowledge, you can build a workflow to manage the data of a Tropy project and integrate it into any external application.&lt;&#x2F;li&gt;
&lt;li&gt;Tropy can import many image formats, including PDFs and &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.bitsgalore.org&#x2F;2024&#x2F;03&#x2F;11&#x2F;multi-image-tiffs-subfiles-and-image-file-directories&quot;&gt;multi-page TIFFs&lt;&#x2F;a&gt;.&lt;&#x2F;li&gt;
&lt;li&gt;You can describe images with standard templates (a default Tropy template and a Dublin Core one) or create your own.&lt;&#x2F;li&gt;
&lt;li&gt;Tropy can be extended with plugins.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;h3 id=&quot;iiif-plugin-tropiiify&quot;&gt;IIIF Plugin: tropiiify&lt;&#x2F;h3&gt;
&lt;p align=&quot;center&quot;&gt;&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;tropiiify.svg&quot;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;One plugin that stands out is &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;arkalab&#x2F;tropiiify&quot;&gt;&lt;strong&gt;tropiiify&lt;&#x2F;strong&gt;&lt;&#x2F;a&gt;. With this plugin you can export a Tropy collection to a static IIIF collection: images will be saved in tiles (no IIIF server required), and manifests and collection files will be generated. You simply need to move the exported output to a static HTTP server (remember to configure CORS).&lt;&#x2F;p&gt;
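&lt;p&gt;To try the export locally before uploading it, a minimal sketch of a static server that adds the CORS header could look like this. It only uses the Python standard library; the &lt;code&gt;make_server&lt;&#x2F;code&gt; helper and the port are assumptions for illustration, and a production server (nginx, Caddy, an object storage bucket) would be configured through its own settings instead.&lt;&#x2F;p&gt;

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class CORSRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that lets any origin read the exported files."""

    def end_headers(self):
        # IIIF viewers usually fetch manifests and tiles from another
        # origin, so every response carries this header.
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()


def make_server(port=8000):
    # Hypothetical helper: serves the current directory on localhost only.
    return ThreadingHTTPServer(("127.0.0.1", port), CORSRequestHandler)
```

&lt;p&gt;Run it from the export directory with &lt;code&gt;make_server().serve_forever()&lt;&#x2F;code&gt; and point a viewer to the local manifest URL.&lt;&#x2F;p&gt;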
&lt;p&gt;Notes:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Every document needs to have an &lt;code&gt;identifier&lt;&#x2F;code&gt;. Use whatever you want: for small collections, sequential numbers are enough. Alternatively, use UUIDs or any other unique identifier, like &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;zelark.github.io&#x2F;nano-id-cc&#x2F;&quot;&gt;Nanoid&lt;&#x2F;a&gt; (if you don&#x27;t want to script a Nanoid generator, point your browser to &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;uuid.rocks&#x2F;nanoid&quot;&gt;UUID Nanoid Generator&lt;&#x2F;a&gt; and get a new identifier with each reload).&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
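&lt;p&gt;If you prefer to script the identifiers, here is a minimal Nanoid-style sketch using only the Python standard library. It is not the official Nanoid implementation: the alphabet, the default length, and the &lt;code&gt;make_identifier&lt;&#x2F;code&gt; name are assumptions chosen for illustration.&lt;&#x2F;p&gt;

```python
import secrets

# Assumed alphabet: digits and lowercase letters, which keeps the
# identifiers safe to use in URLs and directory names.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def make_identifier(size=21):
    """Return a random identifier made of `size` characters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(size))


print(make_identifier())
```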
&lt;p align=&quot;center&quot;&gt;&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;tropy1.jpeg&quot;&gt;&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;You can &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;docs.tropy.org&#x2F;in-the-item-view&#x2F;selections&quot;&gt;&lt;strong&gt;annotate&lt;&#x2F;strong&gt; images with Tropy&lt;&#x2F;a&gt;. The annotations will be included in the exported manifest!&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p align=&quot;center&quot;&gt;&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;tropy4.jpeg&quot;&gt;&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;You can create multiple export configurations. Set &lt;strong&gt;IIIF base id&lt;&#x2F;strong&gt; to the full public URL where you are going to publish the export.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p align=&quot;center&quot;&gt;&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;tropy3.jpeg&quot;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Here is an example collection, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;docuver.se&#x2F;statiiify&#x2F;index.json&quot;&gt;https:&#x2F;&#x2F;docuver.se&#x2F;statiiify&#x2F;index.json&lt;&#x2F;a&gt; (just some book covers shot with a smartphone), that can be opened with any IIIF viewer (&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;tify.rocks&#x2F;?manifest=https%3A%2F%2Fdocuver.se%2Fstatiiify%2Findex.json&amp;amp;tify=%7B%22childManifestUrl%22%3A%22https%3A%2F%2Fdocuver.se%2Fstatiiify%2Fyosyij-w8eonh4whu7wcv%2Fmanifest.json%22%2C%22view%22%3A%22collection%22%7D&quot;&gt;tify&lt;&#x2F;a&gt; or &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;iiif.systems&#x2F;m3&#x2F;?iiif-content=https:&#x2F;&#x2F;docuver.se&#x2F;statiiify&#x2F;index.json&quot;&gt;mirador&lt;&#x2F;a&gt;).&lt;&#x2F;p&gt;
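&lt;p&gt;A collection file is plain JSON, so it can also be consumed by a script. The sketch below lists the manifests referenced by a collection. I am assuming the usual IIIF Presentation shapes here (&lt;code&gt;items&lt;&#x2F;code&gt; in version 3, &lt;code&gt;manifests&lt;&#x2F;code&gt; in version 2); the sample data is hypothetical and is not copied from the export above.&lt;&#x2F;p&gt;

```python
import json


def list_manifests(collection):
    """Return (id, label) pairs for the manifests of a IIIF collection."""
    # Presentation 3 uses "items"; Presentation 2 uses "manifests".
    entries = collection.get("items") or collection.get("manifests") or []
    result = []
    for entry in entries:
        entry_type = entry.get("type") or entry.get("@type") or ""
        # Matches "Manifest" (v3) and "sc:Manifest" (v2), skips the rest.
        if not entry_type.endswith("Manifest"):
            continue
        entry_id = entry.get("id") or entry.get("@id")
        result.append((entry_id, entry.get("label")))
    return result


# Hypothetical sample, shaped like a Presentation 3 collection.
sample = json.loads("""
{
  "type": "Collection",
  "id": "https://example.org/iiif/index.json",
  "items": [
    {"type": "Manifest",
     "id": "https://example.org/iiif/a/manifest.json",
     "label": {"en": ["Book A"]}},
    {"type": "Collection", "id": "https://example.org/iiif/sub.json"}
  ]
}
""")

for manifest_id, label in list_manifests(sample):
    print(manifest_id, label)
```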
&lt;p&gt;There are many other libraries or applications that can help you achieve the same result (&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.libvips.org&#x2F;API&#x2F;current&#x2F;Making-image-pyramids.html&quot;&gt;vips&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;glenrobson&#x2F;iiif-tiler&quot;&gt;iiif-tiler&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;iiif-prezi&#x2F;iiif-prezi&quot;&gt;iiif-prezi&lt;&#x2F;a&gt;), but they require knowledge of the shell and some scripting&#x2F;programming to put everything together.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;canopy&quot;&gt;Canopy&lt;&#x2F;h2&gt;
&lt;p&gt;An IIIF export from Tropy is ready to be used with any IIIF viewer out there. But there is another interesting application: &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;canopy-iiif.github.io&#x2F;docs&quot;&gt;Canopy&lt;&#x2F;a&gt;. It&#x27;s a static site generator for IIIF collections that includes a browsing interface (with facets), a search engine, and an IIIF image viewer (with annotations). Everything is bundled in a static site that doesn&#x27;t need any server-side technology to be served.&lt;&#x2F;p&gt;
&lt;p align=&quot;center&quot;&gt;&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;canopy.jpeg&quot;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Here is a short guide to using Canopy (see also its &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;canopy-iiif.github.io&#x2F;docs&#x2F;get-started&quot;&gt;documentation&lt;&#x2F;a&gt;).&lt;&#x2F;p&gt;
&lt;p&gt;Clone the repository&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;git clone https:&#x2F;&#x2F;github.com&#x2F;canopy-iiif&#x2F;canopy-iiif&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Install dependencies&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;npm i&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Configure&lt;&#x2F;p&gt;
&lt;p&gt;Edit &lt;strong&gt;.env&lt;&#x2F;strong&gt; with the full public URL where you will publish the static exported collection&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;NEXT_PUBLIC_URL=&amp;quot;https:&#x2F;&#x2F;docuver.se&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;NEXT_PUBLIC_BASE_PATH=&amp;quot;&#x2F;statiiify&#x2F;browse&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Edit &lt;strong&gt;config&#x2F;canopy.json&lt;&#x2F;strong&gt; with the IIIF collection manifest&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;{&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;quot;collection&amp;quot;: &amp;quot;https:&#x2F;&#x2F;docuver.se&#x2F;statiiify&#x2F;index.json&amp;quot;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;quot;devCollection&amp;quot;: &amp;quot;https:&#x2F;&#x2F;docuver.se&#x2F;statiiify&#x2F;index.json&amp;quot;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;quot;featured&amp;quot;: [&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    &amp;quot;https:&#x2F;&#x2F;docuver.se&#x2F;statiiify&#x2F;yosyij-w8eonh4whu7wcv&#x2F;manifest.json&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  ],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;quot;metadata&amp;quot;: [&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    &amp;quot;Title&amp;quot;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    &amp;quot;Creator&amp;quot;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    &amp;quot;Date&amp;quot;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    &amp;quot;Publisher&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  ],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;...&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Build&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;npm run build:static&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Deploy online: copy the contents of the &lt;strong&gt;out&lt;&#x2F;strong&gt; directory to your HTTP server.&lt;&#x2F;p&gt;
&lt;p&gt;Here is a complete demo &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;docuver.se&#x2F;statiiify&#x2F;browse&#x2F;search&quot;&gt;https:&#x2F;&#x2F;docuver.se&#x2F;statiiify&#x2F;browse&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>Revamping IIIF.link</title> <published>2024-05-04T00:00:00+00:00</published>
                <updated>2024-05-04T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/iiif-link/" />
                <link rel="alternate"
            href="https://literarymachin.es/iiif-link/" type="text/html" />
                <id>https://literarymachin.es/iiif-link/</id> <summary type="html">&lt;p&gt;A few &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;groups.google.com&#x2F;g&#x2F;iiif-discuss&#x2F;c&#x2F;G4Liut7OhsQ&#x2F;m&#x2F;aRn1hRXNCAAJ&quot;&gt;years ago&lt;&#x2F;a&gt;, I had developed a small application that allowed you to &quot;frame&quot; a specific part of an IIIF image and share it on the web through simple, concise URLs.
But the initial version was rudimentary and only supported IIIF 2, so I&#x27;ve since revamped it using the latest release of the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;tify.rocks&quot;&gt;TIFY viewer&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;A few &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;groups.google.com&#x2F;g&#x2F;iiif-discuss&#x2F;c&#x2F;G4Liut7OhsQ&#x2F;m&#x2F;aRn1hRXNCAAJ&quot;&gt;years ago&lt;&#x2F;a&gt;, I had developed a small application that allowed you to &quot;frame&quot; a specific part of an IIIF image and share it on the web through simple, concise URLs.
But the initial version was rudimentary and only supported IIIF 2, so I&#x27;ve since revamped it using the latest release of the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;tify.rocks&quot;&gt;TIFY viewer&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;tify.rocks&quot;&gt;TIFY&lt;&#x2F;a&gt; is a lightweight IIIF viewer written with VueJS. Its standout feature is the automatic reflection of document navigation states (zoom, pan, page) in the URL itself. This unique capability enables users to bookmark and share URLs effortlessly. TIFY operates entirely client-side, eliminating the need for additional services.&lt;&#x2F;p&gt;
&lt;p&gt;However, I like short, simple (and possibly persistent) URLs, ideal for sharing in various documents and messages. Enter &lt;strong&gt;iiif.link&lt;&#x2F;strong&gt;, which saves the current state remotely and generates a short URL with a &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;zelark.github.io&#x2F;nano-id-cc&#x2F;&quot;&gt;unique identifier&lt;&#x2F;a&gt;, following this format:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;https:&#x2F;&#x2F;iiif.link&#x2F;id&#x2F;{nanoid}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Examples:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;iiif.link&#x2F;id&#x2F;z0hxtyw7bunq&quot;&gt;https:&#x2F;&#x2F;iiif.link&#x2F;id&#x2F;z0hxtyw7bunq&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;iiif.link&#x2F;id&#x2F;4y021xqxitoj&quot;&gt;https:&#x2F;&#x2F;iiif.link&#x2F;id&#x2F;4y021xqxitoj&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;iiif.link&#x2F;id&#x2F;7zeutzwjyofj&quot;&gt;https:&#x2F;&#x2F;iiif.link&#x2F;id&#x2F;7zeutzwjyofj&lt;&#x2F;a&gt; (even multiple pages or different manifests)&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;This application operates without requiring user login, ensuring complete anonymity, and once a link is generated, it cannot be modified. The user interface remains minimalistic and will continue to do so. The server, a lightweight Go application, stores data in a SQLite database.
Its source code is publicly available at &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iiif.link&quot;&gt;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iiif.link&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;Features of the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;iiif.link&#x2F;tree&#x2F;v1&quot;&gt;previous version&lt;&#x2F;a&gt; that are still missing, and that I&#x27;m working on:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Opengraph metadata&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;p&gt;This feature provided a preview when sharing links on social networks and messaging platforms. Old example:&lt;&#x2F;p&gt;
&lt;p&gt;&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;iiiflink1.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;HTTP headers&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Some HTTP headers exposed the IIIF resources, such as the canvas, the image, the label, the manifest. While I&#x27;m unsure of its utility, it might serve as a simpler alternative for machine consumption compared to APIs.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;curl -I https:&#x2F;&#x2F;iiif.link&#x2F;id&#x2F;{id}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;...&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;X-Iiif-Canvas: https:&#x2F;&#x2F;___&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;X-Iiif-Image: https:&#x2F;&#x2F;____&#x2F;35,168,1703,788&#x2F;,100&#x2F;0&#x2F;default.jpg&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;X-Iiif-Label: Document title&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;X-Iiif-Manifest: https:&#x2F;&#x2F;___&#x2F;123&#x2F;manifest.json&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;X-Iiif-Page: 11&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
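&lt;p&gt;To illustrate the machine-consumption idea, here is a minimal sketch that collects the &lt;code&gt;X-Iiif-*&lt;&#x2F;code&gt; fields from a raw header block such as the output of &lt;code&gt;curl -I&lt;&#x2F;code&gt;. The function name and the sample values are hypothetical, and it assumes the headers are exposed in the form shown above.&lt;&#x2F;p&gt;

```python
def iiif_headers(raw_headers):
    """Collect the X-Iiif-* fields from a raw HTTP header block."""
    found = {}
    for line in raw_headers.splitlines():
        # Split on the first colon only, so URLs in the value stay intact.
        name, sep, value = line.partition(":")
        name = name.strip().lower()  # header names are case-insensitive
        if sep and name.startswith("x-iiif-"):
            found[name[len("x-iiif-"):]] = value.strip()
    return found


# Hypothetical response, following the layout shown above.
sample = """HTTP/1.1 200 OK
Content-Type: text/html
X-Iiif-Label: Document title
X-Iiif-Manifest: https://example.org/123/manifest.json
X-Iiif-Page: 11
"""

print(iiif_headers(sample))
```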
&lt;p&gt;&lt;strong&gt;But isn&#x27;t there an IIIF standard for this?&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Indeed, the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;iiif.io&#x2F;api&#x2F;content-state&#x2F;1.0&#x2F;&quot;&gt;Content State API 1.0&lt;&#x2F;a&gt; exists, although it&#x27;s yet to be integrated into major viewers. However, I&#x27;m considering implementing an export feature in this format for greater interoperability.&lt;&#x2F;p&gt;
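&lt;p&gt;As a sketch of what such an export could involve: the Content State API describes an encoding where the JSON is URI-encoded, then base64url-encoded, with the padding removed. The functions below follow that recipe; the sample state and its URLs are hypothetical, and this is not code from iiif.link.&lt;&#x2F;p&gt;

```python
import base64
import json
from urllib.parse import quote, unquote


def encode_content_state(plain):
    """URI-encode, base64url-encode, then strip the padding."""
    # quote() approximates encodeURIComponent; it escapes a few more
    # characters, which decode the same way.
    uri_encoded = quote(plain, safe="")
    b64 = base64.urlsafe_b64encode(uri_encoded.encode("utf-8"))
    return b64.decode("ascii").rstrip("=")


def decode_content_state(encoded):
    """Reverse of encode_content_state."""
    padded = encoded + "=" * (-len(encoded) % 4)  # restore the padding
    uri_encoded = base64.urlsafe_b64decode(padded).decode("utf-8")
    return unquote(uri_encoded)


# Hypothetical state: a region of a canvas that is part of a manifest.
state = json.dumps({
    "id": "https://example.org/canvas/1#xywh=35,168,1703,788",
    "type": "Canvas",
    "partOf": [{"id": "https://example.org/manifest.json", "type": "Manifest"}],
})

encoded = encode_content_state(state)
print(encoded)
```

&lt;p&gt;The encoded string can then be passed to a viewer that supports the &lt;code&gt;iiif-content&lt;&#x2F;code&gt; parameter.&lt;&#x2F;p&gt;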
</content>
    </entry> <entry xml:lang="en">
        <title>ArchivIIIFy</title> <published>2020-07-05T00:00:00+00:00</published>
                <updated>2020-07-05T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/archiviiify/" />
                <link rel="alternate"
            href="https://literarymachin.es/archiviiify/" type="text/html" />
                <id>https://literarymachin.es/archiviiify/</id> <summary type="html">&lt;p&gt;A short guide to download digitized books from &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.archive.org&quot;&gt;Internet Archive&lt;&#x2F;a&gt; and rehost on your own infrastructure using &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;iiif.io&quot;&gt;IIIF&lt;&#x2F;a&gt; with &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;dbmdz.github.io&#x2F;solr-ocrhighlighting&#x2F;&quot;&gt;full-text search&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;A short guide to download digitized books from &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.archive.org&quot;&gt;Internet Archive&lt;&#x2F;a&gt; and rehost on your own infrastructure using &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;iiif.io&quot;&gt;IIIF&lt;&#x2F;a&gt; with &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;dbmdz.github.io&#x2F;solr-ocrhighlighting&#x2F;&quot;&gt;full-text search&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;I&#x27;m an avid explorer of Internet Archive (I also contribute to it with some scans of my zine collection), and I usually download the content I find valuable to my disks so that I can browse and read it offline.&lt;br &#x2F;&gt;
The following guide is a quick tutorial describing some scripts and infrastructure pieces (Docker) I&#x27;ve developed lately to download digitized books and rehost them locally with IIIF, which gives me a better viewer (where I can annotate content) and also full-text search (but note: IA already has full-text search, and it is good).&lt;&#x2F;p&gt;
&lt;p&gt;To start, clone this repository &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;archiviiify&quot;&gt;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;archiviiify&lt;&#x2F;a&gt; and fire up the docker compose stack. It will start these containers:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;nginx&lt;&#x2F;strong&gt; that is proxying various things and hosting the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;projectmirador.org&#x2F;&quot;&gt;Mirador&lt;&#x2F;a&gt; viewer&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;ruven&#x2F;iipsrv&quot;&gt;&lt;strong&gt;iipsrv&lt;&#x2F;strong&gt;&lt;&#x2F;a&gt; (with &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;uclouvain&#x2F;openjpeg&quot;&gt;openjpeg&lt;&#x2F;a&gt; to decode JPEG2000) for serving IIIF images&lt;&#x2F;li&gt;
&lt;li&gt;&lt;strong&gt;memcached&lt;&#x2F;strong&gt; used by iipsrv&lt;&#x2F;li&gt;
&lt;li&gt;&lt;strong&gt;solr&lt;&#x2F;strong&gt; with &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;dbmdz.github.io&#x2F;solr-ocrhighlighting&#x2F;&quot;&gt;ocr highlighting plugin&lt;&#x2F;a&gt; (thanks! &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;twitter.com&#x2F;jbaiter_&quot;&gt;@jbaiter_&lt;&#x2F;a&gt;)&lt;&#x2F;li&gt;
&lt;li&gt;the &lt;strong&gt;search api&lt;&#x2F;strong&gt;: a simple &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;deno.land&quot;&gt;Deno&lt;&#x2F;a&gt; application that translates Solr responses into IIIF search responses&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;The steps needed:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;archiviiify&#x2F;#download-images&quot;&gt;Download images&lt;&#x2F;a&gt; from Internet Archive&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a href=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;archiviiify&#x2F;#generate-iiif-manifest&quot;&gt;Generate IIIF Manifest&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a href=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;archiviiify&#x2F;#generate-ocr&quot;&gt;Generate OCR&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a href=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;archiviiify&#x2F;#index-to-solr&quot;&gt;Index to Solr&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a href=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;archiviiify&#x2F;#view&quot;&gt;View&lt;&#x2F;a&gt; and have fun&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;&lt;strong&gt;Disclaimer&lt;&#x2F;strong&gt;: there are a lot of moving parts (and not enough glue). I&#x27;ll write a proper Makefile at some point. For each of the following steps there is a shell script in &lt;code&gt;.&#x2F;scripts&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;download-images&quot;&gt;Download images&lt;&#x2F;h3&gt;
&lt;p&gt;Internet Archive automatically derives other formats when something is ingested: after digitized books are uploaded (as a PDF or a zip of images), they are converted to &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;JPEG_2000&quot;&gt;JPEG2000&lt;&#x2F;a&gt; (the full text is also extracted and other things are generated).
JPEG2000 images are ready to be used with the IIIF server, so there is no need to convert them again to &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;iipimage.sourceforge.io&#x2F;documentation&#x2F;images&#x2F;&quot;&gt;pyramidal&lt;&#x2F;a&gt; formats.&lt;br &#x2F;&gt;
To download use the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;jjjake&#x2F;internetarchive&quot;&gt;internetarchive cli&lt;&#x2F;a&gt;:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;ia&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; list&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -l -f&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;Single Page Processed JP2 ZIP&amp;quot; ITEM&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;example:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;ia&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; list&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -l -f&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;Single Page Processed JP2 ZIP&amp;quot; codici-immaginari-1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;https:&#x2F;&#x2F;archive.org&#x2F;download&#x2F;codici-immaginari-1&#x2F;codici-immaginari-1_jp2.zip&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Run the script that downloads and unzips the images into &lt;code&gt;.&#x2F;data&lt;&#x2F;code&gt;:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;.&#x2F;scripts&#x2F;get&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; ITEM&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h3 id=&quot;generate-iiif-manifest&quot;&gt;Generate IIIF manifest&lt;&#x2F;h3&gt;
&lt;p&gt;JP2 images from &lt;code&gt;.&#x2F;data&lt;&#x2F;code&gt; directory are served by the iipsrv container following this pattern:&lt;&#x2F;p&gt;
&lt;p&gt;&lt;code&gt;data&#x2F;item&#x2F;file.jp2&lt;&#x2F;code&gt; → &lt;code&gt;http:&#x2F;&#x2F;localhost:8094&#x2F;iiif&#x2F;item&#x2F;file.jp2&#x2F;info.json&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
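&lt;p&gt;The same mapping, written as a small helper. This is only an illustration of the pattern above; the function name is hypothetical and it is not part of the repository scripts.&lt;&#x2F;p&gt;

```python
from pathlib import PurePosixPath

# Base URL of the local iipsrv endpoint, as in the pattern above.
IIIF_BASE = "http://localhost:8094/iiif"


def info_json_url(path):
    """Map data/ITEM/FILE.jp2 to the URL of its IIIF info.json."""
    parts = PurePosixPath(path).parts
    if len(parts) != 3 or parts[0] != "data":
        raise ValueError(f"expected data/ITEM/FILE, got {path!r}")
    item, filename = parts[1], parts[2]
    return f"{IIIF_BASE}/{item}/{filename}/info.json"


print(info_json_url("data/item/file.jp2"))
# prints http://localhost:8094/iiif/item/file.jp2/info.json
```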
&lt;p&gt;To generate the IIIF manifest run (&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;deno.land&quot;&gt;Deno&lt;&#x2F;a&gt; is required to be installed locally):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;.&#x2F;scripts&#x2F;iiif&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; ITEM&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The manifest is saved to &lt;code&gt;www&#x2F;manifests&lt;&#x2F;code&gt; and published to&lt;br &#x2F;&gt;
&lt;code&gt;http:&#x2F;&#x2F;localhost:8094&#x2F;manifests&#x2F;ITEM.json&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
&lt;p&gt;I found &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;deno.land&quot;&gt;Deno&lt;&#x2F;a&gt; extremely useful for quick prototyping. The &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;archiviiify&#x2F;blob&#x2F;master&#x2F;scripts&#x2F;make-manifest.js&quot;&gt;script to generate the manifest&lt;&#x2F;a&gt; is very simple (and incomplete). Better ways and libraries exist to produce IIIF Presentation manifests; look at &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;iiif-commons.github.io&#x2F;manifesto&#x2F;&quot;&gt;manifesto&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
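&lt;p&gt;For readers who want to see the general shape of such a manifest, here is a minimal sketch in Python. It is not a port of the script linked above: it assumes the IIIF Presentation 3 layout, an Image API 2 service with a &lt;code&gt;level1&lt;&#x2F;code&gt; profile, page sizes passed in by hand, and a hypothetical canvas URL pattern that contains the image file name.&lt;&#x2F;p&gt;

```python
BASE = "http://localhost:8094"  # address of the local stack used above


def make_manifest(item, pages):
    """Build a minimal IIIF Presentation 3 manifest as a dict.

    `pages` is a list of (filename, width, height) tuples; the sizes
    would normally be read from the info.json of each image.
    """
    manifest_id = f"{BASE}/manifests/{item}.json"
    canvases = []
    for filename, width, height in pages:
        service_id = f"{BASE}/iiif/{item}/{filename}"
        # Assumed pattern: the canvas id contains the image file name.
        canvas_id = f"{manifest_id}/canvas/{filename}"
        image = {
            "id": f"{service_id}/full/full/0/default.jpg",
            "type": "Image",
            "format": "image/jpeg",
            "width": width,
            "height": height,
            # Assumed service type and profile for the image server.
            "service": [{"id": service_id, "type": "ImageService2",
                         "profile": "level1"}],
        }
        annotation = {
            "id": f"{canvas_id}/image",
            "type": "Annotation",
            "motivation": "painting",
            "target": canvas_id,
            "body": image,
        }
        canvases.append({
            "id": canvas_id,
            "type": "Canvas",
            "width": width,
            "height": height,
            "items": [{"id": f"{canvas_id}/page",
                       "type": "AnnotationPage",
                       "items": [annotation]}],
        })
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": manifest_id,
        "type": "Manifest",
        "label": {"none": [item]},
        "items": canvases,
    }
```

&lt;p&gt;Calling &lt;code&gt;json.dumps(make_manifest(&quot;ITEM&quot;, pages), indent=2)&lt;&#x2F;code&gt; would give the text to save under &lt;code&gt;www&#x2F;manifests&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;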
&lt;h3 id=&quot;generate-ocr&quot;&gt;Generate OCR&lt;&#x2F;h3&gt;
&lt;p&gt;Internet Archive also runs OCR and extracts the full text with ABBYY, but the ABBYY output is not a &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;dbmdz.github.io&#x2F;solr-ocrhighlighting&#x2F;formats&#x2F;&quot;&gt;supported format&lt;&#x2F;a&gt; for the OCR highlighting plugin. I tried to convert it using this &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;raw.githubusercontent.com&#x2F;OCR-D&#x2F;format-converters&#x2F;master&#x2F;abbyy2hocr.xsl&quot;&gt;xsl&lt;&#x2F;a&gt; (Saxon needed, not xsltproc), but the result is not usable: the required &lt;code&gt;ocrx_word&lt;&#x2F;code&gt; classes are missing. I haven&#x27;t looked into it deeply (XSLT gives me headaches), so I gave up and re-ran OCR with &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;tesseract-ocr.github.io&#x2F;&quot;&gt;Tesseract 4&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;Run:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;.&#x2F;scripts&#x2F;ocr&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; ITEM&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The previous script creates a file with the list of images:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; find data&#x2F;ITEM&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;*&lt;&#x2F;span&gt;&lt;span&gt;.jp2 &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span&gt; ITEM.list&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;and run Tesseract (you need to specify the proper language model):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; tesseract -l ita ITEM.list ITEM hocr&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;This can take some time. To speed things up, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.gnu.org&#x2F;software&#x2F;parallel&#x2F;&quot;&gt;GNU parallel&lt;&#x2F;a&gt; could be used to generate hOCR for each image, and the results could then be combined with &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;tmbdev&#x2F;hocr-tools#hocr-combine&quot;&gt;hocr-combine&lt;&#x2F;a&gt;.&lt;br &#x2F;&gt;
A small fix is needed for the resulting hOCR: Tesseract gives the &lt;code&gt;ocr_page&lt;&#x2F;code&gt; elements ids of the form &lt;code&gt;page_{1..n}&lt;&#x2F;code&gt;, but I prefer to use the full name of the original image file, which also appears in the canvas identifier in the IIIF manifest:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;html&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt; &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;div&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; class&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;#39;ocr_page&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; id&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;#39;page_1&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; ...&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;↳&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;html&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt; &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;div&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; class&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;#39;ocr_page&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; id&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;#39;file_0000.jp2&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; ...&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Run&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;.&#x2F;scripts&#x2F;ocr-fix&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; ITEM&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;HOCR&quot;&gt;hOCR&lt;&#x2F;a&gt; is XHTML, so it would be advisable to use a proper parser (or XSLT). The previous script uses some CLI voodoo out of laziness (&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.gnu.org&#x2F;software&#x2F;parallel&#x2F;&quot;&gt;parallel&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;ericchiang&#x2F;pup&quot;&gt;pup&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;chmln&#x2F;sd&quot;&gt;sd&lt;&#x2F;a&gt; required):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #88846F;&quot;&gt;#!&#x2F;usr&#x2F;bin&#x2F;env bash&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;ITEM&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt;$1&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;parallel&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -j1&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; sd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -f&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; w {&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;1&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;} {&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;2&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;} &amp;quot;ocr&#x2F;$ITEM.hocr&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; \&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; :::&lt;&#x2F;span&gt;&lt;span&gt; $(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;pup&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; .ocr_page attr{id}&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;ocr&#x2F;$ITEM.hocr&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; \&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; :::+&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; $&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;find&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; data&#x2F;$ITEM&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;*&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;.jp2&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -exec&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; basename {}&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; \;&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h3 id=&quot;index-to-solr&quot;&gt;Index to Solr&lt;&#x2F;h3&gt;
&lt;p&gt;The hOCR file is ready to be indexed to Solr:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;POST solr&#x2F;ocr&#x2F;updates&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;   {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;   &amp;#39;id&amp;#39;: &amp;#39;ITEM&amp;#39;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;   &amp;#39;ocr_text&amp;#39;: &amp;#39;&#x2F;ocr&#x2F;ITEM.hocr&amp;#39;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;   &amp;#39;source&amp;#39;:&amp;#39;IA&amp;#39;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;   }&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Run&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;.&#x2F;scripts&#x2F;index&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; ITEM&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
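&lt;p&gt;The Solr document shown above uses single quotes, which strict JSON parsers reject. A minimal Python sketch of building the same document as valid JSON follows; the helper name &lt;code&gt;build_solr_doc&lt;&#x2F;code&gt; is hypothetical, and the actual &lt;code&gt;scripts&#x2F;index&lt;&#x2F;code&gt; may well work differently:&lt;&#x2F;p&gt;

```python
import json


def build_solr_doc(item, source="IA"):
    """Return the Solr document for one item as a JSON string.

    Hypothetical helper: the field names mirror the example document above.
    """
    doc = {
        "id": item,
        # Path of the hOCR file, following the /ocr/ITEM.hocr pattern
        "ocr_text": f"/ocr/{item}.hocr",
        "source": source,
    }
    return json.dumps(doc)


if __name__ == "__main__":
    # Only prints the payload; sending it to Solr is left to the index script
    print(build_solr_doc("ITEM"))
```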
&lt;p&gt;Go to the Solr admin at http:&#x2F;&#x2F;localhost:8983 to try some queries, or reach the IIIF Search API at &lt;code&gt;http:&#x2F;&#x2F;localhost:8094&#x2F;search&#x2F;ITEM?q=....&lt;&#x2F;code&gt;&lt;br &#x2F;&gt;
The query can be tweaked &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;archiviiify&#x2F;blob&#x2F;master&#x2F;iiif-search-api&#x2F;main.js#L17-L21&quot;&gt;here&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;view&quot;&gt;View&lt;&#x2F;h3&gt;
&lt;p&gt;Open &lt;code&gt;http:&#x2F;&#x2F;localhost:8094&#x2F;mirador?manifest=ITEM&lt;&#x2F;code&gt; and enjoy reading your book with Mirador 3! This tutorial is not exclusive to Internet Archive: it can be used to publish any content in IIIF.&lt;&#x2F;p&gt;
&lt;p&gt;A video that shows how it works:&lt;&#x2F;p&gt;
 &lt;video width=&quot;500&quot; height=&quot;300&quot; controls&gt;
   &lt;source src=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;archiviiify.mp4&quot; type=&quot;video&#x2F;mp4&quot;&gt;
Your browser does not support the video tag.
&lt;&#x2F;video&gt;
&lt;p&gt;Send your love to Internet Archive: use it and &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;archive.org&#x2F;donate&#x2F;&quot;&gt;donate&lt;&#x2F;a&gt;!&lt;&#x2F;p&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>Pywb 2.0 - docker quickstart</title> <published>2018-01-31T00:00:00+00:00</published>
                <updated>2018-01-31T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/pywb-2/" />
                <link rel="alternate"
            href="https://literarymachin.es/pywb-2/" type="text/html" />
                <id>https://literarymachin.es/pywb-2/</id> <summary type="html">&lt;p&gt;Four years have passed since I &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;pywb-wayback-machine&#x2F;&quot;&gt;first wrote&lt;&#x2F;a&gt; about &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;pywb.readthedocs.io&#x2F;en&#x2F;latest&#x2F;&quot;&gt;pywb&lt;&#x2F;a&gt;: it was a young tool at the time, but already usable and extremely simple to deploy.
Since then a lot of work has been done by Ilya Kreymer (and others), resulting in all the new features available with the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;webrecorder.github.io&#x2F;2018&#x2F;01&#x2F;30&#x2F;pywb-release.html&quot;&gt;2.0 release&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;Four years have passed since I &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;pywb-wayback-machine&#x2F;&quot;&gt;first wrote&lt;&#x2F;a&gt; about &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;pywb.readthedocs.io&#x2F;en&#x2F;latest&#x2F;&quot;&gt;pywb&lt;&#x2F;a&gt;: it was a young tool at the time, but already usable and extremely simple to deploy.
Since then a lot of work has been done by Ilya Kreymer (and others), resulting in all the new features available with the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;webrecorder.github.io&#x2F;2018&#x2F;01&#x2F;30&#x2F;pywb-release.html&quot;&gt;2.0 release&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;Also, some very big webarchiving initiatives have moved to &lt;strong&gt;pywb&lt;&#x2F;strong&gt; in recent years: &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;webrecorder.io&quot;&gt;Webrecorder&lt;&#x2F;a&gt; itself, &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;rhizome.org&#x2F;&quot;&gt;Rhizome&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;perma.cc&#x2F;&quot;&gt;Perma&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;arquivo.pt&#x2F;&quot;&gt;Arquivo PT&lt;&#x2F;a&gt; in Portugal, the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;twitter.com&#x2F;bncfirenze&#x2F;status&#x2F;844219966505320450&quot;&gt;Italian National Library&lt;&#x2F;a&gt; in Florence (Italy), and others I&#x27;m missing.&lt;&#x2F;p&gt;
&lt;p&gt;For many years I&#x27;ve used &lt;strong&gt;pywb&lt;&#x2F;strong&gt; for my personal private webarchive on a shared host, with the setup described &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;literarymachin.es&#x2F;pywb-wayback-machine&quot;&gt;here&lt;&#x2F;a&gt;. Nowadays shared hosts are all but defunct, and cloud virtual machines are cheaper than ever.&lt;&#x2F;p&gt;
&lt;p&gt;The simplest way you can use pywb today for your own instance is probably &lt;strong&gt;docker&lt;&#x2F;strong&gt;.
Here is a quick tutorial:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;pull the docker image&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;docker pull webrecorder&#x2F;pywb&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;create a directory to keep the collection&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;mkdir ~&#x2F;webarchive; cd ~&#x2F;webarchive&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;initialise the collection (rename &lt;em&gt;my-collection&lt;&#x2F;em&gt; as you prefer)&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;docker run --rm -v ~&#x2F;webarchive:&#x2F;webarchive webrecorder&#x2F;pywb wb-manager init my-collection&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;add archived contents, copying WARCs you have previously created&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;cp $file.warc.gz ~&#x2F;webarchive&#x2F;collections&#x2F;my-collection&#x2F;archive&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;index the collection&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;docker run --rm -v ~&#x2F;webarchive:&#x2F;webarchive webrecorder&#x2F;pywb wb-manager reindex my-collection&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;a &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;oduwsdl&#x2F;ORS&#x2F;wiki&#x2F;CDXJ&quot;&gt;CDXJ&lt;&#x2F;a&gt; index will be created in &lt;code&gt;~&#x2F;webarchive&#x2F;collections&#x2F;my-collection&#x2F;indexes&#x2F;index.cdxj&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;start it: pywb will run on localhost:8080&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;docker run -d --name pywb -v ~&#x2F;webarchive:&#x2F;webarchive -p 8080:8080 webrecorder&#x2F;pywb&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;open http:&#x2F;&#x2F;localhost:8080&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
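&lt;p&gt;The docker steps above can also be collected in a small Python sketch. The function name &lt;code&gt;pywb_commands&lt;&#x2F;code&gt; is hypothetical; it only assembles the command lines already shown (copying the WARC files is left out) and executes nothing:&lt;&#x2F;p&gt;

```python
from pathlib import Path


def pywb_commands(collection="my-collection", archive_dir=None):
    """Return the docker commands from the steps above as argv lists.

    Nothing is executed here; each list can be handed to subprocess.run.
    """
    archive_dir = Path(archive_dir or Path.home() / "webarchive")
    volume = f"{archive_dir}:/webarchive"
    # Common prefix: throwaway container with the archive directory mounted
    run = ["docker", "run", "--rm", "-v", volume, "webrecorder/pywb"]
    return [
        ["docker", "pull", "webrecorder/pywb"],
        run + ["wb-manager", "init", collection],
        run + ["wb-manager", "reindex", collection],
        ["docker", "run", "-d", "--name", "pywb", "-v", volume,
         "-p", "8080:8080", "webrecorder/pywb"],
    ]


if __name__ == "__main__":
    for argv in pywb_commands():
        print(" ".join(argv))
```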
&lt;p&gt;Easy!&lt;&#x2F;p&gt;
&lt;p&gt;Again, why has pywb been so important in the webarchiving scene?
Because it focuses on individuals, making it easy to create, curate and maintain personal web archives!&lt;&#x2F;p&gt;
&lt;iframe src=&quot;https:&#x2F;&#x2F;digipres.club&#x2F;@despens&#x2F;99443704321052297&#x2F;embed&quot; class=&quot;mastodon-embed&quot; style=&quot;max-width: 100%; border: 0&quot; width=&quot;600&quot; height=&quot;400&quot;&gt;&lt;&#x2F;iframe&gt;&lt;script src=&quot;https:&#x2F;&#x2F;digipres.club&#x2F;embed.js&quot; async=&quot;async&quot;&gt;&lt;&#x2F;script&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>Anonymous webarchiving</title> <published>2017-10-05T00:00:00+00:00</published>
                <updated>2017-10-05T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/anonymous-webarchiving/" />
                <link rel="alternate"
            href="https://literarymachin.es/anonymous-webarchiving/" type="text/html" />
                <id>https://literarymachin.es/anonymous-webarchiving/</id> <summary type="html">&lt;p&gt;Webarchiving activities, as any other activity where an HTTP client is involved, leave marks of their steps: the web server you are visiting or crawling will save your IP address in its logs (or even worse it can decide to ban your IP). This is usually not a problem, there are plenty of good reasons for a webserver to keep logs of its visitors.&lt;br &#x2F;&gt;
But sometimes you may need to protect your own identity when you are visiting or saving something from a website, and there are a lot of sensitive professions that need this protection: activists, journalists, political dissidents.&lt;br &#x2F;&gt;
&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.torproject.org&#x2F;&quot;&gt;TOR&lt;&#x2F;a&gt; was invented for this, and today offers good protection for browsing the web anonymously.&lt;br &#x2F;&gt;
&lt;strong&gt;Can we also archive the web through TOR?&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;Webarchiving activities, as any other activity where an HTTP client is involved, leave marks of their steps: the web server you are visiting or crawling will save your IP address in its logs (or even worse it can decide to ban your IP). This is usually not a problem, there are plenty of good reasons for a webserver to keep logs of its visitors.&lt;br &#x2F;&gt;
But sometimes you may need to protect your own identity when you are visiting or saving something from a website, and there are a lot of sensitive professions that need this protection: activists, journalists, political dissidents.&lt;br &#x2F;&gt;
&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.torproject.org&#x2F;&quot;&gt;TOR&lt;&#x2F;a&gt; was invented for this, and today offers good protection for browsing the web anonymously.&lt;br &#x2F;&gt;
&lt;strong&gt;Can we also archive the web through TOR?&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;Actually it is not difficult: we need the TOR daemon running and then we have to proxy our webarchiving client through it. Every crawler (Heritrix, wget, wpull) can be configured to use a proxy.&lt;&#x2F;p&gt;
&lt;p&gt;Here I want to use &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;ikreymer&#x2F;pywb&quot;&gt;&lt;strong&gt;pywb&lt;&#x2F;strong&gt;&lt;&#x2F;a&gt;, a Python implementation of the Wayback Machine (I wrote about it in the past!), with a &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;pywb.readthedocs.io&#x2F;en&#x2F;develop&#x2F;manual&#x2F;usage.html#using-pywb-recorder&quot;&gt;&lt;strong&gt;new recorder feature&lt;&#x2F;strong&gt;&lt;&#x2F;a&gt; that will soon be released (kudos to &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;twitter.com&#x2F;IlyaKreymer&quot;&gt;@IlyaKreymer&lt;&#x2F;a&gt; and &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;twitter.com&#x2F;webrecorder_io&quot;&gt;@webrecorder&lt;&#x2F;a&gt;).&lt;&#x2F;p&gt;
&lt;p&gt;A quick guide for macOS, easy to adapt to GNU&#x2F;Linux:&lt;&#x2F;p&gt;
&lt;h3 id=&quot;install-and-run-tor&quot;&gt;Install and run TOR&lt;&#x2F;h3&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ brew install tor&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ echo &amp;quot;TestSocks 1&amp;quot; | tee ~&#x2F;.torrc&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ tor -f ~&#x2F;.torrc&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Keep the daemon running in the foreground. After the last step, check its output and verify that it logs something like this, to be sure that there are no leaks:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;Oct 05 12:25:41.000 [notice] Your application (using socks5 to port 42) instructed Tor to take care of the DNS resolution itself if necessary. This is good.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h3 id=&quot;configure-torsocks&quot;&gt;Configure torsocks&lt;&#x2F;h3&gt;
&lt;p&gt;verify that you have version 2.2.0:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ torsocks --version&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;Torsocks 2.2.0&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;change the default configuration:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; TORSOCKS_CONF&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&#x2F;usr&#x2F;local&#x2F;Cellar&#x2F;torsocks&#x2F;2.2.0&#x2F;etc&#x2F;tor&#x2F;torsocks.conf&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; gsed -i &lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;#39;&#x2F;AllowInbound&#x2F;s&#x2F;^#&#x2F;&#x2F;g&amp;#39;&lt;&#x2F;span&gt;&lt;span&gt; $TORSOCKS_CONF&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; gsed -i &lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;#39;&#x2F;AllowOutboundLocalhost&#x2F;s&#x2F;^#&#x2F;&#x2F;g&amp;#39;&lt;&#x2F;span&gt;&lt;span&gt; $TORSOCKS_CONF&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h3 id=&quot;install-pywb&quot;&gt;Install pywb&lt;&#x2F;h3&gt;
&lt;p&gt;install pywb from the develop branch&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ pip3 install git+https:&#x2F;&#x2F;github.com&#x2F;ikreymer&#x2F;pywb@develop&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;create an archive&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ mkdir webarchive&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ cd webarchive&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ wb-manager init anonymous-archive&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ echo &amp;quot;recorder: live&amp;quot; | tee config.yaml&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h3 id=&quot;run-pywb-behind-tor&quot;&gt;Run pywb behind TOR&lt;&#x2F;h3&gt;
&lt;p&gt;set your shell to use torsocks by default, so that all network activity is proxied through TOR:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ . torsocks on&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;run pywb:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;~ wayback --live -a --auto-interval 10&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;record your site:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;http:&#x2F;&#x2F;localhost:8080&#x2F;anonymous-archive&#x2F;record&#x2F;{URL-TO-RECORD}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
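&lt;p&gt;As a small illustration of the URL pattern above, here is a Python sketch that builds the recording address for a given target. The helper &lt;code&gt;record_url&lt;&#x2F;code&gt; is hypothetical, and the host and collection defaults are simply the values used in this guide:&lt;&#x2F;p&gt;

```python
def record_url(target, collection="anonymous-archive", host="localhost:8080"):
    """Build the pywb recording URL for a target URL.

    Hypothetical helper: it follows the pattern
    http://HOST/COLLECTION/record/URL-TO-RECORD shown above.
    """
    return f"http://{host}/{collection}/record/{target}"


if __name__ == "__main__":
    # The target is appended as-is after the /record/ prefix
    print(record_url("https://check.torproject.org"))
```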
&lt;p&gt;&lt;strong&gt;important&lt;&#x2F;strong&gt;: always use a dedicated browser for this, to avoid leaks by extensions or other custom settings. Also make sure to disable DNS Prefetch:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Firefox: &lt;code&gt;about:config&lt;&#x2F;code&gt; ➜ set &lt;code&gt;network.dns.disablePrefetch&lt;&#x2F;code&gt; to &lt;code&gt;true&lt;&#x2F;code&gt;&lt;&#x2F;li&gt;
&lt;li&gt;Chrome: &lt;em&gt;Settings&lt;&#x2F;em&gt; ➜ &lt;em&gt;Advanced&lt;&#x2F;em&gt; ➜ &lt;em&gt;Privacy and security&lt;&#x2F;em&gt; ➜ turn off &quot;&lt;em&gt;Use a prediction service to load pages more quickly&lt;&#x2F;em&gt;&quot;&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;Browse the site: everything will be recorded inside
&lt;code&gt;.&#x2F;collections&#x2F;anonymous-archive&lt;&#x2F;code&gt;.
You can replay the recordings with pywb itself or with &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;webrecorder&#x2F;webrecorderplayer-electron&quot;&gt;Webrecorder Player&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;Beware: double check every step and make sure to test it with a known website where you can check the access log to verify that the IP address that is hitting the server is not yours.&lt;&#x2F;strong&gt;
Or, even better, record &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;check.torproject.org&quot;&gt;https:&#x2F;&#x2F;check.torproject.org&lt;&#x2F;a&gt; and verify that you get this message:&lt;&#x2F;p&gt;
&lt;p&gt;&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;pywb-tor.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>Open BNI</title> <published>2016-09-03T00:00:00+00:00</published>
                <updated>2016-09-03T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/open-bni/" />
                <link rel="alternate"
            href="https://literarymachin.es/open-bni/" type="text/html" />
                <id>https://literarymachin.es/open-bni/</id> <summary type="html">&lt;p&gt;On &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;mailman.wikimedia.it&#x2F;pipermail&#x2F;bibliotecari&#x2F;2016-May&#x2F;003789.html&quot;&gt;30 May 2016&lt;&#x2F;a&gt; the free release of the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;it.wikipedia.org&#x2F;wiki&#x2F;Bibliografia_nazionale_italiana&quot;&gt;Bibliografia Nazionale Italiana&lt;&#x2F;a&gt; (BNI) was announced. The opening of this catalogue was welcomed (even with the limitation of PDF-only files), and as a layman in library science I also asked a &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;mailman.wikimedia.it&#x2F;pipermail&#x2F;bibliotecari&#x2F;2016-May&#x2F;003790.html&quot;&gt;question&lt;&#x2F;a&gt; about the actual use case of the BNI.&lt;br &#x2F;&gt;
On &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;mailman.wikimedia.it&#x2F;pipermail&#x2F;bibliotecari&#x2F;2016-September&#x2F;003831.html&quot;&gt;30 August 2016&lt;&#x2F;a&gt; it was announced that the 2015 and 2016 volumes were also released in UNIMARC and MARCXML formats.&lt;br &#x2F;&gt;
Intrigued by the catalogue, I started exploring it, thinking about possible transformations (RDF triples) or enrichments with&#x2F;towards other data (Wikidata).&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;On &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;mailman.wikimedia.it&#x2F;pipermail&#x2F;bibliotecari&#x2F;2016-May&#x2F;003789.html&quot;&gt;30 May 2016&lt;&#x2F;a&gt; the free release of the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;it.wikipedia.org&#x2F;wiki&#x2F;Bibliografia_nazionale_italiana&quot;&gt;Bibliografia Nazionale Italiana&lt;&#x2F;a&gt; (BNI) was announced. The opening of this catalogue was welcomed (even with the limitation of PDF-only files), and as a layman in library science I also asked a &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;mailman.wikimedia.it&#x2F;pipermail&#x2F;bibliotecari&#x2F;2016-May&#x2F;003790.html&quot;&gt;question&lt;&#x2F;a&gt; about the actual use case of the BNI.&lt;br &#x2F;&gt;
On &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;mailman.wikimedia.it&#x2F;pipermail&#x2F;bibliotecari&#x2F;2016-September&#x2F;003831.html&quot;&gt;30 August 2016&lt;&#x2F;a&gt; it was announced that the 2015 and 2016 volumes were also released in UNIMARC and MARCXML formats.&lt;br &#x2F;&gt;
Intrigued by the catalogue, I started exploring it, thinking about possible transformations (RDF triples) or enrichments with&#x2F;towards other data (Wikidata).&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;Here is how to explore the data with &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;basex.org&quot;&gt;basex&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;it.wikipedia.org&#x2F;wiki&#x2F;XQuery&quot;&gt;xquery&lt;&#x2F;a&gt; and a &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;jupyter.org&#x2F;&quot;&gt;jupyter&lt;&#x2F;a&gt; notebook:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;download the XML files (I limit myself to analysing the monographs):&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; curl -OJLs &lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;http:&#x2F;&#x2F;bni.bncf.firenze.sbn.it&#x2F;bniweb&#x2F;scaricaxml.jsp?mese=01&amp;amp;anno=2015&amp;amp;serie=Monografie&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; curl -OJLs &lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;http:&#x2F;&#x2F;bni.bncf.firenze.sbn.it&#x2F;bniweb&#x2F;scaricaxml.jsp?mese=02&amp;amp;anno=2015&amp;amp;serie=Monografie&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; curl -OJLs &lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;http:&#x2F;&#x2F;bni.bncf.firenze.sbn.it&#x2F;bniweb&#x2F;scaricaxml.jsp?mese=03&amp;amp;anno=2015&amp;amp;serie=Monografie&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; curl -OJLs &lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;http:&#x2F;&#x2F;bni.bncf.firenze.sbn.it&#x2F;bniweb&#x2F;scaricaxml.jsp?mese=01&amp;amp;anno=2016&amp;amp;serie=Monografie&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; sha1sum Monografie&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;*&lt;&#x2F;span&gt;&lt;span&gt;.xml&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;7c226c88daefd7b145ebb0bc01d621ba9f3ea9b3&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; Monografie201501.xml&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;204134fef0f5275f466feb9c6a018c794fadd07b&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; Monografie201502.xml&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;bdbcab246290b9d2e0db3b7279bd32ea20ea6ef3&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; Monografie201503.xml&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;c8e56442bc5c8a1e7fb9e31731108ba586993c17&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; Monografie201601.xml&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
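&lt;p&gt;The same check can be sketched in Python with the standard &lt;code&gt;hashlib&lt;&#x2F;code&gt; module. The helper names &lt;code&gt;sha1_of&lt;&#x2F;code&gt; and &lt;code&gt;verify&lt;&#x2F;code&gt; are hypothetical; the checksums are the ones listed above, and the files are assumed to sit in the current directory:&lt;&#x2F;p&gt;

```python
import hashlib
import os

# Checksums as listed in the sha1sum output above
EXPECTED = {
    "Monografie201501.xml": "7c226c88daefd7b145ebb0bc01d621ba9f3ea9b3",
    "Monografie201502.xml": "204134fef0f5275f466feb9c6a018c794fadd07b",
    "Monografie201503.xml": "bdbcab246290b9d2e0db3b7279bd32ea20ea6ef3",
    "Monografie201601.xml": "c8e56442bc5c8a1e7fb9e31731108ba586993c17",
}


def sha1_of(path, chunk_size=65536):
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(expected=EXPECTED):
    """Map each file name to True if it exists and its checksum matches."""
    results = {}
    for name, value in expected.items():
        results[name] = os.path.exists(name) and sha1_of(name) == value
    return results


if __name__ == "__main__":
    for name, ok in verify().items():
        print(name, "OK" if ok else "MISSING OR DIFFERENT")
```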
&lt;ul&gt;
&lt;li&gt;install &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;basex.org&quot;&gt;basex&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;Debian&#x2F;Ubuntu: &lt;code&gt;~ apt-get install basex&lt;&#x2F;code&gt;&lt;br &#x2F;&gt;
macOS: &lt;code&gt;~ brew install basex&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;create the database and load the XML files&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; basex -c &lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;create database bni&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; basex -i bni -c &lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;add Monografie201501.xml; add Monografie201502.xml;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;\&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; add Monografie201503.xml; add Monografie201601.xml&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;ul&gt;
&lt;li&gt;start the database server&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; basexserver&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;ul&gt;
&lt;li&gt;install jupyter notebook and the Python client library for basex&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; pip3 install jupyter&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; pip3 install basexclient&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;~&lt;&#x2F;span&gt;&lt;span&gt; jupyter notebook&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;In a new notebook you can then start extracting data with XQuery. Here is an example of a simple function that counts the occurrences of a path (XPath):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;python&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;from&lt;&#x2F;span&gt;&lt;span&gt; BaseXClient&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; import&lt;&#x2F;span&gt;&lt;span&gt; BaseXClient&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;session&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span&gt; BaseXClient.Session(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;#39;127.0.0.1&amp;#39;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 1984&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;admin&amp;#39;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;admin&amp;#39;&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;publisher&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;&#x2F;&#x2F;rec&#x2F;df[@t=&amp;quot;210&amp;quot;]&#x2F;sf[@c=&amp;quot;c&amp;quot;]&amp;#39;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;publisher_city&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;&#x2F;&#x2F;rec&#x2F;df[@t=&amp;quot;210&amp;quot;]&#x2F;sf[@c=&amp;quot;a&amp;quot;]&amp;#39;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;subject&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;&#x2F;&#x2F;rec&#x2F;df[@t=&amp;quot;606&amp;quot;]&#x2F;sf[@c=&amp;quot;a&amp;quot;]&amp;#39;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;def&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; count&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt;path&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #FD971F;font-style: italic;&quot;&gt; limit&lt;&#x2F;span&gt;&lt;span&gt;):&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    q&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;&amp;#39;&amp;#39;let $db := db:open(&amp;quot;bni&amp;quot;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;    let $result :=&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;        for $publisher in distinct-values($db&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;{0}&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;            let $count := count(index-of($db&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;{0}&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;, $publisher))&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;            order by $count descending&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;            return concat($publisher, &amp;quot;, &amp;quot;, $count)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;    for $limited at $lim in subsequence($result, 1, &lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;{1}&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;    return $limited&amp;#39;&amp;#39;&amp;#39;&lt;&#x2F;span&gt;&lt;span&gt;.format(path, limit)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    query&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span&gt; session.query(q)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    # iter() yields (type, item) pairs; print only the item&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;    for&lt;&#x2F;span&gt;&lt;span&gt; _, item&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; in&lt;&#x2F;span&gt;&lt;span&gt; query.iter():&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt;        print&lt;&#x2F;span&gt;&lt;span&gt;(item)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    query.close()&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The 10 publishers and the 10 cities with the most publications:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;python&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;count(publisher,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 10&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;count(publisher_city,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 10&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;or the 30 most used subjects (&lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;unimarc-it.wikidot.com&#x2F;606&quot;&gt;field 606&lt;&#x2F;a&gt;):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;python&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;count(subject,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 30&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
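&lt;p&gt;For readers without a running BaseX server, the same frequency count can be sketched with the Python standard library. This is only an illustration, not the code of the notebook: &lt;code&gt;count_values&lt;&#x2F;code&gt; is a hypothetical helper, and the &lt;code&gt;rec&lt;&#x2F;code&gt;, &lt;code&gt;df&lt;&#x2F;code&gt; and &lt;code&gt;sf&lt;&#x2F;code&gt; elements with their &lt;code&gt;t&lt;&#x2F;code&gt; and &lt;code&gt;c&lt;&#x2F;code&gt; attributes are assumed from the XPath expressions above. It expects an already parsed ElementTree root.&lt;&#x2F;p&gt;

```python
from collections import Counter
import xml.etree.ElementTree as ET


def count_values(root, tag, code, limit):
    """Return the `limit` most frequent values of subfield `code` in field `tag`."""
    # Same shape as the XPath used with BaseX, e.g. //rec/df[@t="210"]/sf[@c="c"]
    path = ".//rec/df[@t='{0}']/sf[@c='{1}']".format(tag, code)
    values = (sf.text for sf in root.findall(path) if sf.text)
    # most_common() sorts by descending count, like "order by $count descending"
    return Counter(values).most_common(limit)
```

&lt;p&gt;With a parsed dump, &lt;code&gt;count_values(root, &quot;210&quot;, &quot;c&quot;, 10)&lt;&#x2F;code&gt; would roughly correspond to &lt;code&gt;count(publisher, 10)&lt;&#x2F;code&gt;, at the cost of loading the whole file in memory.&lt;&#x2F;p&gt;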
&lt;p&gt;The ready-made notebook can be downloaded from &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;bni-xquery&#x2F;blob&#x2F;master&#x2F;bni-xquery.ipynb&quot;&gt;atomotic&#x2F;bni-xquery&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;h3 id=&quot;conclusioni&quot;&gt;Conclusions&lt;&#x2F;h3&gt;
&lt;p&gt;What can be done with this data? Enriching Wikidata is an obvious choice: for example, there are very few items for &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;tinyurl.com&#x2F;j9xfkqz&quot;&gt;Italian publishers&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;Can the BNI be considered a first step towards fully opening up the OPAC catalogue? Let&#x27;s hope so.&lt;&#x2F;p&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>Epub linkrot</title> <published>2015-03-03T00:00:00+00:00</published>
                <updated>2015-03-03T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/epub-linkrot/" />
                <link rel="alternate"
            href="https://literarymachin.es/epub-linkrot/" type="text/html" />
                <id>https://literarymachin.es/epub-linkrot/</id> <summary type="html">&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Link_rot&quot;&gt;Linkrot&lt;&#x2F;a&gt; also affects epub files (who would have thought! :)).&lt;br &#x2F;&gt;
How to check the health of external links in epub books (required tools: a shell, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;savannah.nongnu.org&#x2F;projects&#x2F;atool&quot;&gt;atool&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;EricChiang&#x2F;pup&quot;&gt;pup&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;www.gnu.org&#x2F;software&#x2F;parallel&quot;&gt;gnu parallel&lt;&#x2F;a&gt;).&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Link_rot&quot;&gt;Linkrot&lt;&#x2F;a&gt; also affects epub files (who would have thought! :)).&lt;br &#x2F;&gt;
How to check the health of external links in epub books (required tools: a shell, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;savannah.nongnu.org&#x2F;projects&#x2F;atool&quot;&gt;atool&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;EricChiang&#x2F;pup&quot;&gt;pup&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;www.gnu.org&#x2F;software&#x2F;parallel&quot;&gt;gnu parallel&lt;&#x2F;a&gt;).&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;&lt;h3 id=&quot;extract-all-external-links&quot;&gt;extract all external links&lt;&#x2F;h3&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; acat&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -F&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; zip {FILE.epub} &amp;quot;*.xhtml&amp;quot; &amp;quot;*.html&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; \&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;|&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; pup&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;a attr{href}&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; \&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;|&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; egrep&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;quot;^http&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; |&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; sort&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; |&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; uniq&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; \&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; links.txt&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h3 id=&quot;check-http-status&quot;&gt;check http status&lt;&#x2F;h3&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; echo &amp;quot;http_code, url&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; links-status.csv&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; parallel&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -j 10&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;curl -k -L -s -o &#x2F;dev&#x2F;null -w &amp;quot;%{http_code}&amp;quot; {}; echo &amp;quot;, {}\n&amp;quot;&amp;#39; :::: links.txt&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; links-status.csv&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
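&lt;p&gt;If the shell tools are not at hand, the two steps above could also be sketched with the Python standard library alone. This is a rough equivalent under some assumptions, not a drop-in replacement: &lt;code&gt;external_links&lt;&#x2F;code&gt;, &lt;code&gt;http_status&lt;&#x2F;code&gt; and &lt;code&gt;status_csv&lt;&#x2F;code&gt; are hypothetical names, failed connections are reported as &lt;code&gt;0&lt;&#x2F;code&gt;, and, unlike &lt;code&gt;curl -k&lt;&#x2F;code&gt;, certificates are still verified.&lt;&#x2F;p&gt;

```python
import csv
import io
import urllib.error
import urllib.request
import zipfile
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every anchor that starts with 'http'."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value and value.startswith("http"):
                    self.links.add(value)


def external_links(epub):
    """Return the sorted, de-duplicated external links of an epub (path or file object)."""
    links = set()
    with zipfile.ZipFile(epub) as archive:
        for name in archive.namelist():
            if name.endswith((".xhtml", ".html")):
                parser = LinkCollector()
                parser.feed(archive.read(name).decode("utf-8", errors="replace"))
                parser.close()
                links.update(parser.links)
    return sorted(links)


def http_status(url, timeout=10):
    """Return the final HTTP status after redirects, or 0 on a network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except (urllib.error.URLError, OSError, ValueError):
        return 0


def status_csv(urls, fetch=http_status, workers=10):
    """Check urls concurrently and return 'http_code,url' rows as CSV text."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(fetch, urls))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["http_code", "url"])
    writer.writerows(zip(codes, urls))
    return out.getvalue()
```

&lt;p&gt;The &lt;code&gt;fetch&lt;&#x2F;code&gt; parameter exists only so the status check can be replaced by a stub when trying the sketch offline; &lt;code&gt;status_csv(external_links(&quot;book.epub&quot;))&lt;&#x2F;code&gt; would produce the same two columns as the shell version.&lt;&#x2F;p&gt;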
&lt;p&gt;&lt;code&gt;links-status.csv&lt;&#x2F;code&gt; is a CSV file containing the &lt;code&gt;http_code&lt;&#x2F;code&gt; and the original &lt;code&gt;url&lt;&#x2F;code&gt;. With &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;csvkit.readthedocs.org&#x2F;en&#x2F;0.9.0&#x2F;&quot;&gt;csvkit&lt;&#x2F;a&gt; installed, you can run this simple analysis:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; csvstat&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; --freq -c&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; http_code links-status.csv&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; |&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; jq&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; .&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;This gives a summary of the links grouped by &lt;code&gt;http_code&lt;&#x2F;code&gt;. The following example comes from a book I bought in 2011 (14 links gone):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;{&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;quot;403&amp;quot;: 1,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;quot;301&amp;quot;: 2,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;quot;404&amp;quot;: 14,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  &amp;quot;200&amp;quot;: 95&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h3 id=&quot;what-to-do&quot;&gt;what to do?&lt;&#x2F;h3&gt;
&lt;p&gt;So you&#x27;ve discovered that your beloved ebooks are full of rotten links.
Here is what you could do:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;you are a compulsive reader and digital book hoarder?&lt;&#x2F;em&gt; archive the links from each book yourself after you&#x27;ve bought it (removing DRM from your own books is also a safe move)&lt;&#x2F;li&gt;
&lt;li&gt;&lt;em&gt;you are an author or self-publisher (or the equivalent hipster term for it)?&lt;&#x2F;em&gt; archive the links from the book you are writing in the Internet Archive Wayback Machine (using &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;gijn.org&#x2F;2015&#x2F;01&#x2F;27&#x2F;introducing-the-research-desk-secrets-of-the-wayback-machine&#x2F;&quot;&gt;Save Page Now&lt;&#x2F;a&gt;), then link to the archived version. &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;robustlinks.mementoweb.org&#x2F;&quot;&gt;Robust Links&lt;&#x2F;a&gt; are not an option yet, unless some epub reader is implementing them right now.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;em&gt;you are a big publisher?&lt;&#x2F;em&gt; consider running your own web archive and offering it as a service to your authors&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>SKOS Nuovo Soggettario, API and autocomplete</title> <published>2015-02-26T00:00:00+00:00</published>
                <updated>2015-02-26T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/skos-autocomplete/" />
                <link rel="alternate"
            href="https://literarymachin.es/skos-autocomplete/" type="text/html" />
                <id>https://literarymachin.es/skos-autocomplete/</id> <summary type="html">&lt;p&gt;How to build an API for an autocomplete form using the terms of the Nuovo Soggettario, with Redis Sorted Sets and Nginx+Lua.&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;How to build an API for an autocomplete form using the terms of the Nuovo Soggettario, with Redis Sorted Sets and Nginx+Lua.&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;nuovosoggettario.jpg&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;The &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;thes.bncf.firenze.sbn.it&quot;&gt;Nuovo Soggettario&lt;&#x2F;a&gt;, available in &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;thes.bncf.firenze.sbn.it&#x2F;dati&#x2F;NS-SKOS.zip&quot;&gt;SKOS&lt;&#x2F;a&gt; format (CC BY), can easily be used to build APIs for normalization, data entry and cataloguing services. The dataset is small enough that sophisticated tools such as Solr or Elasticsearch are not needed.&lt;&#x2F;p&gt;
&lt;p&gt;The Redis &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;redis.io&#x2F;topics&#x2F;data-types#sorted-sets&quot;&gt;Sorted Sets&lt;&#x2F;a&gt; data type and the &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;redis.io&#x2F;commands&#x2F;ZRANGEBYLEX#details-on-strings-comparison&quot;&gt;ZRANGEBYLEX&lt;&#x2F;a&gt; command are a simple and very effective solution for building &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;autocomplete.redis.io&#x2F;&quot;&gt;autocomplete&lt;&#x2F;a&gt; systems.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;creazione-dell-indice&quot;&gt;creating the index&lt;&#x2F;h2&gt;
&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;nuovosoggettario-skos-redis&quot;&gt;nuovosoggettario-skos-redis&lt;&#x2F;a&gt; contains a demo script in Python (required modules: lxml and redis):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ git clone https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;nuovosoggettario-skos-redis.git&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ cd nuovosoggettario-skos-redis&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ pip install -r requirements.txt&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Download the thesaurus:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ mkdir xml&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ wget http:&#x2F;&#x2F;thes.bncf.firenze.sbn.it&#x2F;dati&#x2F;NS-SKOS.zip&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ unzip NS-SKOS.zip -d xml&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ rm NS-SKOS.zip&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Indexing (use &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;www.gnu.org&#x2F;s&#x2F;parallel&quot;&gt;GNU parallel&lt;&#x2F;a&gt; or xargs to load the files in parallel):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ redis-server &amp;amp;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ parallel -j 8 &#x2F;usr&#x2F;bin&#x2F;env python index.py {} ::: xml&#x2F;*.xml&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The example indexes the &lt;strong&gt;prefLabel&lt;&#x2F;strong&gt; and all the &lt;strong&gt;altLabel&lt;&#x2F;strong&gt; values
(yes, I know, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;nuovosoggettario-skos-redis&#x2F;blob&#x2F;master&#x2F;index.py&quot;&gt;index.py&lt;&#x2F;a&gt; cheats by parsing the XML directly, but parsing the RDF with rdflib is far slower).&lt;&#x2F;p&gt;
&lt;p&gt;Example search:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ redis-cli --raw&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;127.0.0.1:6379&amp;gt; ZRANGEBYLEX autocomplete [archiv &amp;quot;[archiv\xff&amp;quot; LIMIT 0 5&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;archivi capitolari:{&amp;quot;label&amp;quot;:&amp;quot;Archivi capitolari&amp;quot;, &amp;quot;id&amp;quot;:&amp;quot;http:&#x2F;&#x2F;purl.org&#x2F;bncf&#x2F;tid&#x2F;17165&amp;quot;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;archivi comunali:{&amp;quot;label&amp;quot;:&amp;quot;Archivi comunali&amp;quot;, &amp;quot;id&amp;quot;:&amp;quot;http:&#x2F;&#x2F;purl.org&#x2F;bncf&#x2F;tid&#x2F;32025&amp;quot;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;archivi correnti:{&amp;quot;label&amp;quot;:&amp;quot;Archivi correnti&amp;quot;, &amp;quot;id&amp;quot;:&amp;quot;http:&#x2F;&#x2F;purl.org&#x2F;bncf&#x2F;tid&#x2F;52282&amp;quot;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;archivi di autorità di nomi e titoli:{&amp;quot;label&amp;quot;:&amp;quot;Archivi di autorità di nomi e titoli&amp;quot;,     &amp;quot;id&amp;quot;:&amp;quot;http:&#x2F;&#x2F;purl.org&#x2F;bncf&#x2F;tid&#x2F;2260&amp;quot;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;archivi di autorità:{&amp;quot;label&amp;quot;:&amp;quot;Archivi di autorità&amp;quot;, &amp;quot;id&amp;quot;:&amp;quot;http:&#x2F;&#x2F;purl.org&#x2F;bncf&#x2F;tid&#x2F;2261&amp;quot;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
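&lt;p&gt;To see why a lexicographic range query is enough for autocompletion, the lookup can be emulated in plain Python with &lt;code&gt;bisect&lt;&#x2F;code&gt; on a sorted list. This is only a sketch of the idea, with no Redis involved: &lt;code&gt;build_index&lt;&#x2F;code&gt; and &lt;code&gt;autocomplete&lt;&#x2F;code&gt; are hypothetical helpers, the member layout (lowercased label, a colon, then a JSON object) is assumed from the redis-cli output above, and Python compares code points where Redis compares bytes.&lt;&#x2F;p&gt;

```python
import bisect
import json


def build_index(terms):
    """Return a sorted list of 'lowercased label:json' entries from (label, uri) pairs."""
    entries = []
    for label, uri in terms:
        payload = json.dumps({"label": label, "id": uri}, ensure_ascii=False)
        entries.append(label.lower() + ":" + payload)
    return sorted(entries)


def autocomplete(index, prefix, limit=5):
    """Return up to `limit` decoded entries whose key starts with `prefix`."""
    # Lower bound: first entry not smaller than the prefix.
    low = bisect.bisect_left(index, prefix)
    # Upper bound: the prefix followed by a very high character, playing the
    # role of the 0xff byte in the ZRANGEBYLEX call.
    high = bisect.bisect_right(index, prefix + "\uffff")
    matches = index[low:high][:limit]
    # Each entry is 'key:json'; the JSON part starts at the first '{'.
    return [json.loads(entry[entry.index("{"):]) for entry in matches]
```

&lt;p&gt;Both bounds are found by binary search, which is roughly what makes the sorted set approach cheap even without a full-text engine.&lt;&#x2F;p&gt;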
&lt;p&gt;Memory in use:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ redis-cli info | grep used_memory_human&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;used_memory_human:9.57M&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h2 id=&quot;api-di-ricerca&quot;&gt;search API&lt;&#x2F;h2&gt;
&lt;p&gt;The web API can be written in any language. Following the post &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;www.cucumbertown.com&#x2F;craft&#x2F;autocomplete-using-redis-nginx-lua&#x2F;&quot;&gt;Redis on steroids: Autocomplete using Redis, Nginx and Lua&lt;&#x2F;a&gt;, I wanted to try a &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;wiki.nginx.org&#x2F;HttpLuaModule&quot;&gt;Lua&lt;&#x2F;a&gt; script in Nginx.&lt;&#x2F;p&gt;
&lt;p&gt;On Debian (testing, sid) it is enough to install &lt;code&gt;nginx&lt;&#x2F;code&gt; and &lt;code&gt;nginx-extras&lt;&#x2F;code&gt;; otherwise you have to compile &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;openresty.org&#x2F;&quot;&gt;OpenResty&lt;&#x2F;a&gt; by hand.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;openresty&#x2F;lua-resty-redis&quot;&gt;lua-resty-redis&lt;&#x2F;a&gt; has not yet been updated to support the ZRANGEBYLEX command, so it has to be added:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ mkdir &#x2F;srv&#x2F;nginx-lua; cd &#x2F;srv&#x2F;nginx-lua&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ wget https:&#x2F;&#x2F;raw.githubusercontent.com&#x2F;openresty&#x2F;lua-resty-redis&#x2F;master&#x2F;lib&#x2F;resty&#x2F;redis.lua&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ sed -i &amp;#39;s&#x2F;\&amp;quot;zscan\&amp;quot;&#x2F;\&amp;quot;zscan\&amp;quot;,\&amp;quot;zrangebylex\&amp;quot;&#x2F;g&amp;#39; redis.lua&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
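&lt;p&gt;search script: &#x2F;srv&#x2F;nginx-lua&#x2F;&lt;strong&gt;autocomplete.lua&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;lua&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;local redis = require &amp;quot;redis&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;local red = redis:new()&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;red:set_timeout(1000)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;ngx.header.content_type = &amp;#39;text&#x2F;plain&amp;#39;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;-- connect to the local Redis instance&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;local ok, err = red:connect(&amp;quot;127.0.0.1&amp;quot;, 6379)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;if not ok then&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    ngx.status = ngx.HTTP_SERVICE_UNAVAILABLE&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    ngx.say(&amp;quot;Redis down&amp;quot;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    return&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;end&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;-- the prefix to complete comes from the &amp;quot;q&amp;quot; query argument&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;local q = ngx.req.get_uri_args().q&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;if not q then&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    ngx.status = ngx.HTTP_BAD_REQUEST&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    ngx.say(&amp;quot;arguments missing&amp;quot;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    return&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;end&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;-- lexicographic range: every member that starts with q&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;local res, err = red:zrangebylex(&amp;quot;autocomplete&amp;quot;, &amp;quot;[&amp;quot; .. q, &amp;quot;[&amp;quot; .. q .. &amp;quot;\xff&amp;quot;, &amp;quot;LIMIT&amp;quot;, &amp;quot;0&amp;quot;, &amp;quot;100&amp;quot;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;-- emit the JSON part of each member, then close the array with an empty object&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;ngx.say(&amp;quot;[&amp;quot;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;table.foreach(res, function(k, v) ngx.say(string.match(v, &amp;quot;{.*}&amp;quot;) .. &amp;quot;,&amp;quot;) end)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;ngx.say(&amp;#39;{&amp;quot;label&amp;quot;:&amp;quot;&amp;quot;, &amp;quot;id&amp;quot;:&amp;quot;&amp;quot;}]&amp;#39;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;nginx virtualhost configuration; the script is served at &lt;strong&gt;&#x2F;ns-bncf&#x2F;autocomplete&lt;&#x2F;strong&gt;:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;nginx&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;lua_package_path &amp;quot;&#x2F;srv&#x2F;nginx-lua&#x2F;?.lua;;&amp;quot;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;server {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    listen 80 default_server;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    root ....;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    index index.html index.htm;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    server_name ....;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    location &#x2F; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        try_files $uri $uri&#x2F; =404;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    }&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    location = &#x2F;ns-bncf&#x2F;autocomplete {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;        content_by_lua_file &#x2F;srv&#x2F;nginx-lua&#x2F;autocomplete.lua;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    }&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;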
&lt;p&gt;testing the API, which returns an array of JSON objects:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ curl http:&#x2F;&#x2F;atomotic.com&#x2F;ns-bncf&#x2F;autocomplete?q=archiv&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;at this point &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;twitter.github.io&#x2F;typeahead.js&quot;&gt;typeahead&lt;&#x2F;a&gt; (or alternatively jquery-autocomplete) can be used to build a select box with autocompletion.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;DEMO&lt;&#x2F;strong&gt;: &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;atomotic.com&#x2F;ns-bncf&quot;&gt;http:&#x2F;&#x2F;atomotic.com&#x2F;ns-bncf&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;h2 id=&quot;possibili-utilizzi&quot;&gt;Possible uses&lt;&#x2F;h2&gt;
&lt;ul&gt;
&lt;li&gt;a WordPress plugin that uses the thesaurus terms as categories&lt;&#x2F;li&gt;
&lt;li&gt;EPrints &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;wiki.eprints.org&#x2F;w&#x2F;Autocompletion#external_source&quot;&gt;autocomplete&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;li&gt;a reconciliation service for OpenRefine&lt;&#x2F;li&gt;
&lt;li&gt;....&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>Serve deepzoom images from a zip archive with openseadragon</title> <published>2014-11-23T00:00:00+00:00</published>
                <updated>2014-11-23T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/deepzoom-osd-server/" />
                <link rel="alternate"
            href="https://literarymachin.es/deepzoom-osd-server/" type="text/html" />
                <id>https://literarymachin.es/deepzoom-osd-server/</id> <summary type="html">&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;www.vips.ecs.soton.ac.uk&#x2F;index.php?title=VIPS&quot;&gt;vips&lt;&#x2F;a&gt; is a fast image processing system. Versions &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;lists.andrew.cmu.edu&#x2F;pipermail&#x2F;openslide-users&#x2F;2014-June&#x2F;000832.html&quot;&gt;higher than 7.40&lt;&#x2F;a&gt; can generate static tiles of big images in &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Deep_Zoom&quot;&gt;deepzoom&lt;&#x2F;a&gt; format, saving them directly into a zip archive.&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;www.vips.ecs.soton.ac.uk&#x2F;index.php?title=VIPS&quot;&gt;vips&lt;&#x2F;a&gt; is a fast image processing system. Versions &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;lists.andrew.cmu.edu&#x2F;pipermail&#x2F;openslide-users&#x2F;2014-June&#x2F;000832.html&quot;&gt;higher than 7.40&lt;&#x2F;a&gt; can generate static tiles of big images in &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Deep_Zoom&quot;&gt;deepzoom&lt;&#x2F;a&gt; format, saving them directly into a zip archive.&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;As simple as: &lt;code&gt;$ vips dzsave big.jpg image.zip&lt;&#x2F;code&gt;&lt;br &#x2F;&gt;
(note: if you compile vips yourself, make sure you have &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;jcupitt&#x2F;libvips&#x2F;issues&#x2F;173&quot;&gt;libgsf1-dev&lt;&#x2F;a&gt; installed)&lt;&#x2F;p&gt;
&lt;p&gt;A zip archive is more convenient than thousands of small image files scattered across a filesystem, so I was looking for
a simple way to serve it with openseadragon directly from the zip, without extracting it.&lt;br &#x2F;&gt;
In my desire to learn &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;golang.org&quot;&gt;Go&lt;&#x2F;a&gt; better, I&#x27;ve built &lt;strong&gt;&lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;deepzoom-osd-server&quot;&gt;deepzoom-osd-server&lt;&#x2F;a&gt;&lt;&#x2F;strong&gt;, a static web application that embeds openseadragon.&lt;&#x2F;p&gt;
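&lt;p&gt;The core idea, reading a single file out of the archive on each request instead of unpacking it, can be sketched in a few lines of Python. This is not how the Go server is implemented; &lt;code&gt;read_zipped_file&lt;&#x2F;code&gt; is a hypothetical helper, and the &lt;code&gt;&#x2F;dzi&#x2F;NAME.zip&#x2F;MEMBER&lt;&#x2F;code&gt; URL layout is an assumption made for the example.&lt;&#x2F;p&gt;

```python
import os
import zipfile


def read_zipped_file(dzi_dir, url_path):
    """Resolve '/dzi/NAME.zip/MEMBER' to the bytes of MEMBER inside dzi_dir/NAME.zip.

    Returns None when the URL, the archive or the member cannot be resolved.
    """
    prefix = "/dzi/"
    if not url_path.startswith(prefix) or ".zip/" not in url_path:
        return None
    archive_name, member = url_path[len(prefix):].split(".zip/", 1)
    # basename() keeps the lookup inside dzi_dir even if the name contains '../'.
    archive_path = os.path.join(dzi_dir, os.path.basename(archive_name) + ".zip")
    if not os.path.isfile(archive_path):
        return None
    with zipfile.ZipFile(archive_path) as archive:
        try:
            # Only the requested member is decompressed, nothing is written to disk.
            return archive.read(member)
        except KeyError:
            return None
```

&lt;p&gt;A request handler would call this function with the request path and answer with a 404 whenever it returns &lt;code&gt;None&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;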
&lt;h2 id=&quot;compile&quot;&gt;Compile:&lt;&#x2F;h2&gt;
&lt;p&gt;install the &lt;code&gt;gom&lt;&#x2F;code&gt; package manager&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ go get github.com&#x2F;mattn&#x2F;gom&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;clone and compile&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ git clone https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;deepzoom-osd-server.git&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ cd deepzoom-osd-server&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ make&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;atomotic&#x2F;deepzoom-osd-server&#x2F;blob&#x2F;master&#x2F;Makefile&quot;&gt;Makefile&lt;&#x2F;a&gt; downloads the latest &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;openseadragon&#x2F;openseadragon&#x2F;releases&#x2F;download&#x2F;v1.1.1&#x2F;openseadragon-bin-1.1.1.tar.gz&quot;&gt;binary&lt;&#x2F;a&gt; of openseadragon, bundles all dependencies into &lt;code&gt;_vendor&lt;&#x2F;code&gt;, then builds the binary, embedding all static assets (with &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;tebeka&#x2F;nrsc&quot;&gt;nrsc&lt;&#x2F;a&gt;).&lt;&#x2F;p&gt;
&lt;h2 id=&quot;run&quot;&gt;Run&lt;&#x2F;h2&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$ .&#x2F;deepzoom-osd-server&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;-- missing dzi directory, creating.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;-- running on http:&#x2F;&#x2F;127.0.0.1:8080&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Now you can generate some deepzoom images and put them into the &lt;code&gt;dzi&lt;&#x2F;code&gt; directory just created.&lt;&#x2F;p&gt;
&lt;p&gt;The server will expose:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;a &lt;code&gt;&#x2F;dzi&lt;&#x2F;code&gt; endpoint to explore the content of the zip archive&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;http:&#x2F;&#x2F;localhost:8080&#x2F;dzi&#x2F;{ZIPPED-DZI}.zip&#x2F;{ZIPPED-DZI}.dzi&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;a &lt;code&gt;&#x2F;view&lt;&#x2F;code&gt; endpoint with openseadragon&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;http:&#x2F;&#x2F;localhost:8080&#x2F;view&#x2F;{ZIPPED-DZI}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
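&lt;p&gt;A minimal Python sketch of the idea behind these endpoints, not the Go code of deepzoom-osd-server: the request path is split into an archive name and a member name, and that single member is read from the zip without unpacking anything else. The function names and the layout of the demo archive are made up for illustration.&lt;&#x2F;p&gt;

```python
import io
import zipfile


def split_request_path(path):
    """Split 'name.zip/inner/member' into ('name.zip', 'inner/member')."""
    archive, sep, member = path.partition(".zip/")
    if not sep:
        raise ValueError("path does not point inside a zip archive")
    return archive + ".zip", member


def read_member(zip_file, member):
    """Return the bytes of one archive member without unpacking the rest."""
    with zipfile.ZipFile(zip_file) as archive:
        return archive.read(member)


def build_demo_archive():
    """Build a tiny in-memory archive (hypothetical layout, demo content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("image.dzi", "demo descriptor")
        archive.writestr("image_files/0/0_0.jpeg", b"demo tile bytes")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    name, member = split_request_path("image.zip/image_files/0/0_0.jpeg")
    print(name, member, read_member(build_demo_archive(), member))
```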
&lt;p&gt;&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;deepzoom-osd-server-screenshot.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;This is an experiment&lt;&#x2F;strong&gt;. I&#x27;m not sure it is a sane idea to use it for real things; I should stress-test the application (with &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;httpd.apache.org&#x2F;docs&#x2F;2.2&#x2F;programs&#x2F;ab.html&quot;&gt;ab&lt;&#x2F;a&gt; or &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;tsenart&#x2F;vegeta&quot;&gt;vegeta&lt;&#x2F;a&gt;) and monitor its memory consumption.&lt;&#x2F;p&gt;
&lt;p&gt;Anyhow, if you are curating a digital library today and you need to publish big high quality images, you should absolutely look at the &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;iiif.io&quot;&gt;IIIF&lt;&#x2F;a&gt; specs, and use a more
suitable server like &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;iipimage.sourceforge.net&#x2F;&quot;&gt;IIPImage&lt;&#x2F;a&gt; or &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;pulibrary&#x2F;loris&quot;&gt;Loris&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>A wayback machine (pywb) on a cheap, shared host</title> <published>2014-10-23T00:00:00+00:00</published>
                <updated>2014-10-23T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/pywb-wayback-machine/" />
                <link rel="alternate"
            href="https://literarymachin.es/pywb-wayback-machine/" type="text/html" />
                <id>https://literarymachin.es/pywb-wayback-machine/</id> <summary type="html">&lt;p&gt;For a long time the only free (I&#x27;m unaware of commercial ones) implementation of web archive replay software has been the &lt;a href=&quot;http:&#x2F;&#x2F;archive-access.sourceforge.net&#x2F;projects&#x2F;wayback&#x2F;&quot;&gt;Wayback Machine&lt;&#x2F;a&gt; (now &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;netpreserve.org&#x2F;openwayback&quot;&gt;Openwayback&lt;&#x2F;a&gt;). It&#x27;s stable and mature software, with a strong community behind it.&lt;br &#x2F;&gt;
To use it you need to be comfortable deploying a Java web application; it&#x27;s not so difficult, and the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;iipc&#x2F;openwayback&#x2F;wiki&quot;&gt;documentation&lt;&#x2F;a&gt; is exhaustive.&lt;br &#x2F;&gt;
But there is a new player in the game, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;ikreymer&#x2F;pywb&quot;&gt;&lt;strong&gt;pywb&lt;&#x2F;strong&gt;&lt;&#x2F;a&gt;, developed by &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;webrecorder.io&quot;&gt;Ilya Kreymer&lt;&#x2F;a&gt;, a former Internet Archive developer.&lt;br &#x2F;&gt;
It is built in Python, relatively simpler than Wayback, and now used in a professional archiving project at &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;bits.blogs.nytimes.com&#x2F;2014&#x2F;10&#x2F;19&#x2F;a-new-tool-to-preserve-moments-on-the-internet&#x2F;&quot;&gt;Rhizome&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;For a long time the only free (I&#x27;m unaware of commercial ones) implementation of web archive replay software has been the &lt;a href=&quot;http:&#x2F;&#x2F;archive-access.sourceforge.net&#x2F;projects&#x2F;wayback&#x2F;&quot;&gt;Wayback Machine&lt;&#x2F;a&gt; (now &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;netpreserve.org&#x2F;openwayback&quot;&gt;Openwayback&lt;&#x2F;a&gt;). It&#x27;s stable and mature software, with a strong community behind it.&lt;br &#x2F;&gt;
To use it you need to be comfortable deploying a Java web application; it&#x27;s not so difficult, and the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;iipc&#x2F;openwayback&#x2F;wiki&quot;&gt;documentation&lt;&#x2F;a&gt; is exhaustive.&lt;br &#x2F;&gt;
But there is a new player in the game, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;ikreymer&#x2F;pywb&quot;&gt;&lt;strong&gt;pywb&lt;&#x2F;strong&gt;&lt;&#x2F;a&gt;, developed by &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;webrecorder.io&quot;&gt;Ilya Kreymer&lt;&#x2F;a&gt;, a former Internet Archive developer.&lt;br &#x2F;&gt;
It is built in Python, relatively simpler than Wayback, and now used in a professional archiving project at &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;bits.blogs.nytimes.com&#x2F;2014&#x2F;10&#x2F;19&#x2F;a-new-tool-to-preserve-moments-on-the-internet&#x2F;&quot;&gt;Rhizome&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;&lt;strong&gt;Pywb&lt;&#x2F;strong&gt;&#x27;s simplicity and clean design make it very easy to deploy, even on shared hosts.
Nowadays it seems that no one uses shared hosting anymore: virtual servers are often cheaper, and with dozens of orchestration and provisioning tools it&#x27;s even easier to bootstrap a full machine.&lt;br &#x2F;&gt;
Despite this, I still prefer a shared host when the application stack allows it: fewer things to worry about.&lt;&#x2F;p&gt;
&lt;p&gt;So I tried to install &lt;strong&gt;pywb&lt;&#x2F;strong&gt; on &lt;em&gt;dreamhost&lt;&#x2F;em&gt;, a well known cheap provider that supports deploying ruby&#x2F;rack and python&#x2F;wsgi applications via mod_passenger. &lt;strong&gt;In a few minutes I can have my own wayback machine&lt;&#x2F;strong&gt;.&lt;br &#x2F;&gt;
The following steps are specific to &lt;em&gt;dreamhost&lt;&#x2F;em&gt;, but you should be able to replicate this installation on any shared host that supports deploying python apps (fastcgi, uwsgi, passenger).&lt;&#x2F;p&gt;
&lt;h3 id=&quot;steps&quot;&gt;Steps:&lt;&#x2F;h3&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;add a new domain in your dreamhost panel, and set the document root to &lt;code&gt;&#x2F;home&#x2F;{USER}&#x2F;wayback&#x2F;public&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;li&gt;
&lt;p&gt;init virtualenv&lt;&#x2F;p&gt;
&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt; cd&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; ~&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; virtualenv wayback&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;ul&gt;
&lt;li&gt;create the &lt;code&gt;public&lt;&#x2F;code&gt; and &lt;code&gt;tmp&lt;&#x2F;code&gt; directories for passenger, &lt;code&gt;warcs&lt;&#x2F;code&gt; to store warc files and &lt;code&gt;cdx&lt;&#x2F;code&gt; for indexes&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; mkdir&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -p&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; wayback&#x2F;{public,tmp,warcs,cdx}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;ul&gt;
&lt;li&gt;install pywb via pip (inside the virtualenv)&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; source wayback&#x2F;bin&#x2F;activate&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; pip install pywb&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;ul&gt;
&lt;li&gt;edit pywb config file &lt;code&gt;~&#x2F;wayback&#x2F;config.yaml&lt;&#x2F;code&gt; (full &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;ikreymer&#x2F;pywb&#x2F;blob&#x2F;master&#x2F;config.yaml&quot;&gt;documentation&lt;&#x2F;a&gt;)&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;yaml&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;collections&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;  test&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; .&#x2F;cdx&#x2F;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;archive_paths&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; .&#x2F;warcs&#x2F;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;enable_http_proxy&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; true&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;static_routes&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;  static&#x2F;default&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; pywb&#x2F;static&#x2F;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;enable_cdx_api&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; true&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;enable_memento&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; true&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;framed_replay&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; true&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;ul&gt;
&lt;li&gt;edit passenger startup file &lt;code&gt;~&#x2F;wayback&#x2F;passenger_wsgi.py&lt;&#x2F;code&gt;&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;python&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;import&lt;&#x2F;span&gt;&lt;span&gt; sys, os&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;INTERP&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; =&lt;&#x2F;span&gt;&lt;span&gt; os.path.join(os.environ[&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;#39;HOME&amp;#39;&lt;&#x2F;span&gt;&lt;span&gt;],&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;wayback&amp;#39;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;bin&amp;#39;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; &amp;#39;python&amp;#39;&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;if&lt;&#x2F;span&gt;&lt;span&gt; sys.executable&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; !=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; INTERP&lt;&#x2F;span&gt;&lt;span&gt;: os.execl(&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;INTERP&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; INTERP&lt;&#x2F;span&gt;&lt;span&gt;, *sys.argv)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;sys.path.append(os.getcwd())&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;from&lt;&#x2F;span&gt;&lt;span&gt; pywb.apps.wayback&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; import&lt;&#x2F;span&gt;&lt;span&gt; application&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;ul&gt;
&lt;li&gt;put some warc files in ~&#x2F;wayback&#x2F;warcs and generate a sorted cdx&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; cdx-indexer&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; --sort&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; ~&#x2F;wayback&#x2F;cdx ~&#x2F;wayback&#x2F;warcs&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;ul&gt;
&lt;li&gt;run! &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;wayback.literarymachin.es&#x2F;&quot;&gt;http:&#x2F;&#x2F;wayback.literarymachin.es&#x2F;&lt;&#x2F;a&gt;&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;Try searching the &lt;code&gt;&#x2F;test&lt;&#x2F;code&gt; collection for the URL &lt;em&gt;http:&#x2F;&#x2F;twitter.com&#x2F;atomotic&lt;&#x2F;em&gt;; you&#x27;ll get these &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;wayback.literarymachin.es&#x2F;test&#x2F;*&#x2F;twitter.com&#x2F;atomotic&quot;&gt;results&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;And &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;www.mementoweb.org&#x2F;&quot;&gt;Memento&lt;&#x2F;a&gt; is also available:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;$  curl &amp;quot;http:&#x2F;&#x2F;wayback.literarymachin.es&#x2F;test&#x2F;timemap&#x2F;*&#x2F;twitter.com&#x2F;atomotic&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;http:&#x2F;&#x2F;wayback.literarymachin.es&#x2F;test&#x2F;timemap&#x2F;*&#x2F;http:&#x2F;&#x2F;twitter.com&#x2F;atomotic&amp;gt;; rel=&amp;quot;self&amp;quot;; type=&amp;quot;application&#x2F;link-format&amp;quot;; from=&amp;quot;Wed, 22 Oct 2014 16:30:30 GMT&amp;quot;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;http:&#x2F;&#x2F;twitter.com&#x2F;atomotic&amp;gt;; rel=&amp;quot;original&amp;quot;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;http:&#x2F;&#x2F;wayback.literarymachin.es&#x2F;test&#x2F;http:&#x2F;&#x2F;twitter.com&#x2F;atomotic&amp;gt;; rel=&amp;quot;timegate&amp;quot;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;http:&#x2F;&#x2F;wayback.literarymachin.es&#x2F;test&#x2F;20141022163030&#x2F;http:&#x2F;&#x2F;twitter.com&#x2F;atomotic&amp;gt;; rel=&amp;quot;memento&amp;quot;; datetime=&amp;quot;Wed, 22 Oct 2014 16:30:30 GMT&amp;quot;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;http:&#x2F;&#x2F;wayback.literarymachin.es&#x2F;test&#x2F;20141022163031&#x2F;https:&#x2F;&#x2F;twitter.com&#x2F;atomotic&amp;gt;; rel=&amp;quot;memento&amp;quot;; datetime=&amp;quot;Wed, 22 Oct 2014 16:30:31 GMT&amp;quot;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;http:&#x2F;&#x2F;wayback.literarymachin.es&#x2F;test&#x2F;20141022163042&#x2F;https:&#x2F;&#x2F;twitter.com&#x2F;atomotic&amp;gt;; rel=&amp;quot;memento&amp;quot;; datetime=&amp;quot;Wed, 22 Oct 2014 16:30:42 GMT&amp;quot;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;http:&#x2F;&#x2F;wayback.literarymachin.es&#x2F;test&#x2F;20141022163355&#x2F;https:&#x2F;&#x2F;twitter.com&#x2F;atomotic&amp;gt;; rel=&amp;quot;memento&amp;quot;; datetime=&amp;quot;Wed, 22 Oct 2014 16:33:55 GMT&amp;quot;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;&amp;lt;http:&#x2F;&#x2F;wayback.literarymachin.es&#x2F;test&#x2F;20141022163710&#x2F;https:&#x2F;&#x2F;twitter.com&#x2F;atomotic&amp;gt;; rel=&amp;quot;memento&amp;quot;; datetime=&amp;quot;Wed, 22 Oct 2014 16:37:10 GMT&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
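&lt;p&gt;The 14-digit number in each memento URL is the capture time in UTC, the same instant given by the &lt;code&gt;datetime&lt;&#x2F;code&gt; attribute. A small Python sketch of that mapping, using only the standard library (the function name is hypothetical, it is not part of pywb):&lt;&#x2F;p&gt;

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def wayback_timestamp_to_http_date(timestamp):
    """Convert a 14-digit capture timestamp (YYYYMMDDhhmmss, UTC) to an HTTP date."""
    moment = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    # The parsed value is naive; mark it as UTC so it can be rendered as GMT.
    moment = moment.replace(tzinfo=timezone.utc)
    return format_datetime(moment, usegmt=True)


if __name__ == "__main__":
    # Timestamp taken from the first memento in the timemap.
    print(wayback_timestamp_to_http_date("20141022163030"))
```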
&lt;p&gt;Why do I think this will be useful? &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;archiveteam.org&#x2F;&quot;&gt;Archiveteam&lt;&#x2F;a&gt; does a great job of running a &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;tracker.archiveteam.org&#x2F;&quot;&gt;distributed&lt;&#x2F;a&gt; crawling organization, but publishing is still centralized at the Internet Archive. What if we began to publish thousands of small web archives, aggregating&lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;blog.dshr.org&#x2F;2013&#x2F;03&#x2F;re-thinking-memento-aggregation.html&quot;&gt;[1]&lt;&#x2F;a&gt;&lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;inkdroid.org&#x2F;journal&#x2F;2012&#x2F;05&#x2F;03&#x2F;way-way-back&#x2F;&quot;&gt;[2]&lt;&#x2F;a&gt;&lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;inkdroid.org&#x2F;journal&#x2F;2013&#x2F;09&#x2F;30&#x2F;preserving-linked-data&#x2F;&quot;&gt;[3]&lt;&#x2F;a&gt; them with the Memento protocol?&lt;&#x2F;p&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>Open data from the Anagrafe Biblioteche</title> <published>2014-09-22T00:00:00+00:00</published>
                <updated>2014-09-22T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/opendata-anagrafe-biblioteche/" />
                <link rel="alternate"
            href="https://literarymachin.es/opendata-anagrafe-biblioteche/" type="text/html" />
                <id>https://literarymachin.es/opendata-anagrafe-biblioteche/</id> <summary type="html">&lt;p&gt;How to use the open data of the &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;anagrafe.iccu.sbn.it&#x2F;&quot;&gt;Anagrafe delle Biblioteche Italiane&lt;&#x2F;a&gt; (the registry of Italian libraries) to plot library addresses on a web map.&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;How to use the open data of the &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;anagrafe.iccu.sbn.it&#x2F;&quot;&gt;Anagrafe delle Biblioteche Italiane&lt;&#x2F;a&gt; (the registry of Italian libraries) to plot library addresses on a web map.&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;A CSV file with the &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;opendata.anagrafe.iccu.sbn.it&#x2F;territorio.zip&quot;&gt;registry and territorial data&lt;&#x2F;a&gt; can be downloaded from the site&#x27;s &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;anagrafe.iccu.sbn.it&#x2F;opencms&#x2F;opencms&#x2F;open_data&#x2F;&quot;&gt;opendata&lt;&#x2F;a&gt; page. Other datasets are available that can be used to enrich the description of a given library; in this example I limit myself to the general data and the geographic coordinates.
The content of the csv can easily be imported into a relational database, from which the data of interest can be extracted.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;dinedal&#x2F;textql&quot;&gt;textql&lt;&#x2F;a&gt; is an extremely useful tool that lets you run SQL queries directly on the csv (behind the scenes, textql simply loads the data into a temporary sqlite database).&lt;&#x2F;p&gt;
&lt;p&gt;With this single shell command I can extract all the rows of the csv where the field &lt;code&gt;comune&lt;&#x2F;code&gt;=&lt;code&gt;Bologna&lt;&#x2F;code&gt; and save the resulting output to a new csv file.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; textql&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -source&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; territorio.csv&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -header -dlm=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;;&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -lazy-quotes -output-header -sql=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;#39;SELECT __codice_isil_ AS ISIL, denominazione, indirizzo, telefono, email, url, REPLACE(latitudine,&amp;quot;,&amp;quot;,&amp;quot;.&amp;quot;) AS latitudine, REPLACE(longitudine,&amp;quot;,&amp;quot;,&amp;quot;.&amp;quot;) AS longitudine FROM tbl WHERE comune=&amp;quot;Bologna&amp;quot; AND (latitudine !=&amp;quot;&amp;quot; AND latitudine !=&amp;quot;0&amp;quot;) and (longitudine !=&amp;quot;&amp;quot; AND longitudine !=&amp;quot;0&amp;quot;)&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; bologna.csv&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The SQL query (shown below in a more readable form) also discards the rows whose coordinates are empty (or equal to 0,0) and changes the decimal separator of the coordinates from &lt;code&gt;,&lt;&#x2F;code&gt; to &lt;code&gt;.&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;SELECT&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;__codice_isil_ &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;AS&lt;&#x2F;span&gt;&lt;span&gt; ISIL, denominazione, indirizzo, telefono, email, &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;url&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt;REPLACE&lt;&#x2F;span&gt;&lt;span&gt;(latitudine,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;,&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;.&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;) &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;AS&lt;&#x2F;span&gt;&lt;span&gt; latitudine,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;&quot;&gt;REPLACE&lt;&#x2F;span&gt;&lt;span&gt;(longitudine,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;,&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;.&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;) &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;AS&lt;&#x2F;span&gt;&lt;span&gt; longitudine&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;FROM&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;tbl&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;WHERE&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;comune&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;Bologna&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;AND&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;(latitudine &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;!=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; AND&lt;&#x2F;span&gt;&lt;span&gt; latitudine &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;!=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;0&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;AND&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;(longitudine &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;!=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; AND&lt;&#x2F;span&gt;&lt;span&gt; longitudine &lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt;!=&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;0&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;We convert the resulting csv to &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;GeoJSON&quot;&gt;GeoJSON&lt;&#x2F;a&gt; format using the &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;csvkit.readthedocs.org&#x2F;en&#x2F;latest&#x2F;scripts&#x2F;csvjson.html&quot;&gt;csvjson&lt;&#x2F;a&gt; tool from the &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;csvkit.readthedocs.org&#x2F;en&#x2F;latest&#x2F;index.html&quot;&gt;csvkit&lt;&#x2F;a&gt; suite:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; csvjson&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; -d&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt;&amp;quot;;&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; --lat&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; latitudine&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; --lon&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; longitudine bologna.csv&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; |&lt;&#x2F;span&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt; jq&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; .&lt;&#x2F;span&gt;&lt;span style=&quot;color: #F92672;&quot;&gt; &amp;gt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; bologna.geojson&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
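&lt;p&gt;For comparison, a minimal Python sketch that does the filtering and the GeoJSON conversion in one pass with the standard library only. It is an illustration, not a replacement for textql and csvjson: the function name is hypothetical and the sample rows are invented; only the column names and the semicolon delimiter come from the commands above.&lt;&#x2F;p&gt;

```python
import csv
import io
import json


def rows_to_geojson(csv_text, comune):
    """Build a GeoJSON FeatureCollection from semicolon-delimited rows."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text), delimiter=";"):
        # Switch the decimal separator from comma to dot.
        lat = row["latitudine"].replace(",", ".")
        lon = row["longitudine"].replace(",", ".")
        # Skip other towns and rows whose coordinates are empty or zero.
        if row["comune"] != comune or lat in ("", "0") or lon in ("", "0"):
            continue
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered longitude, latitude.
            "geometry": {"type": "Point", "coordinates": [float(lon), float(lat)]},
            "properties": {"denominazione": row["denominazione"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Invented sample rows, only to exercise the function.
SAMPLE = (
    "denominazione;comune;latitudine;longitudine\n"
    "Biblioteca di esempio;Bologna;44,4949;11,3426\n"
    "Biblioteca senza coordinate;Bologna;;\n"
    "Biblioteca altrove;Modena;44,6471;10,9252\n"
)

if __name__ == "__main__":
    print(json.dumps(rows_to_geojson(SAMPLE, "Bologna"), indent=2))
```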
&lt;p&gt;At this point the geojson file can be used with any map visualization library (for example &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;leafletjs.com&#x2F;examples&#x2F;geojson.html&quot;&gt;Leaflet&lt;&#x2F;a&gt;) or, for a quick look, it can be uploaded to a &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;defunkt.io&#x2F;gist&#x2F;&quot;&gt;gist&lt;&#x2F;a&gt;:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;shellscript&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;$&lt;&#x2F;span&gt;&lt;span style=&quot;color: #E6DB74;&quot;&gt; gist bologna.geojson&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #A6E22E;&quot;&gt;https:&#x2F;&#x2F;gist.github.com&#x2F;9d4ed56efcf4f9fc2c61&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The gist can be viewed right away and shows a navigable map (on a mapbox layer) with the points of our libraries (in this example those of the municipality of Bologna):
&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;gist.github.com&#x2F;9d4ed56efcf4f9fc2c61&quot;&gt;https:&#x2F;&#x2F;gist.github.com&#x2F;9d4ed56efcf4f9fc2c61&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;script src=&quot;https:&#x2F;&#x2F;gist.github.com&#x2F;atomotic&#x2F;9d4ed56efcf4f9fc2c61.js&quot;&gt;&lt;&#x2F;script&gt;
</content>
    </entry> <entry xml:lang="en">
        <title>JSON API of the SBN OPAC</title> <published>2014-09-05T00:00:00+00:00</published>
                <updated>2014-09-05T00:00:00+00:00</updated>
                <link href="https://literarymachin.es/sbn-json-api/" />
                <link rel="alternate"
            href="https://literarymachin.es/sbn-json-api/" type="text/html" />
                <id>https://literarymachin.es/sbn-json-api/</id> <summary type="html">&lt;p&gt;A few months ago ICCU released a &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;play.google.com&#x2F;store&#x2F;apps&#x2F;details?id=it.inera.opacmobile&quot;&gt;mobile app&lt;&#x2F;a&gt; for searching the &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;opac.sbn.it&quot;&gt;OPAC SBN&lt;&#x2F;a&gt;.
Although it is not much to look at, the app works well, and I find two features very useful: looking up a book by scanning its barcode with the phone camera, and bookmarking favorites.&lt;br &#x2F;&gt;
Curious about how it works, I decided to inspect its HTTP traffic.&lt;&#x2F;p&gt;</summary> <content
            type="html">&lt;p&gt;A few months ago ICCU released a &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;play.google.com&#x2F;store&#x2F;apps&#x2F;details?id=it.inera.opacmobile&quot;&gt;mobile app&lt;&#x2F;a&gt; for searching the &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;opac.sbn.it&quot;&gt;OPAC SBN&lt;&#x2F;a&gt;.
Although it is not much to look at, the app works well, and I find two features very useful: looking up a book by scanning its barcode with the phone camera, and bookmarking favorites.&lt;br &#x2F;&gt;
Curious about how it works, I decided to inspect its HTTP traffic.&lt;&#x2F;p&gt;
&lt;span id=&quot;continue-reading&quot;&gt;&lt;&#x2F;span&gt;
&lt;p&gt;&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;sbn-mobile-screenshot-2.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;With &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;mitmproxy.org&#x2F;&quot;&gt;mitmproxy&lt;&#x2F;a&gt; running on my laptop, I configured the Android device under&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;Settings &amp;gt; Wi-Fi &amp;gt; Modify Network &amp;gt; Show advanced options &amp;gt; Proxy: Manual&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;setting &lt;code&gt;Proxy hostname&lt;&#x2F;code&gt; and &lt;code&gt;Proxy port&lt;&#x2F;code&gt; to the laptop&#x27;s address on my network and port &lt;code&gt;:8080&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;Running a few searches in the app let me inspect its HTTP traffic, which showed JSON data being exchanged with the endpoint &lt;code&gt;http:&#x2F;&#x2F;opac.sbn.it&#x2F;opacmobilegw&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
&lt;p&gt;&lt;img src=&quot;&#x2F;assets&#x2F;images&#x2F;mitmproxy.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Here are some of the main APIs:&lt;&#x2F;p&gt;
&lt;h2 id=&quot;ricerca-libera&quot;&gt;Free-text search&lt;&#x2F;h2&gt;
&lt;p&gt;URL: &lt;code&gt;http:&#x2F;&#x2F;opac.sbn.it&#x2F;opacmobilegw&#x2F;search.json?any={STRING}&amp;amp;type=0&amp;amp;start=0&amp;amp;rows=3&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Searches the whole catalog for a &lt;code&gt;{STRING}&lt;&#x2F;code&gt; (paging through the results with the &lt;code&gt;start&lt;&#x2F;code&gt; and &lt;code&gt;rows&lt;&#x2F;code&gt; parameters) and returns a series of records in the following format (along with other interesting information, such as facets and subjects).&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;json&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;{&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;autorePrincipale&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Comici, Emilio&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;citazioni&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;codiceIdentificativo&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;IT&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;\\&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;ICCU&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;\\&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;RAV&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;\\&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;2002745&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;livello&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Monografia&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;localizzazioni&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;luogoNormalizzato&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;nomi&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;note&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;numeri&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;progressivoId&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 0&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;pubblicazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Milano : Corriere Della Sera, 2014&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;tipo&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Testo a stampa&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;titolo&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Alpinismo eroico &#x2F; Emilio Comici&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h2 id=&quot;ricerca-per-isbn&quot;&gt;Search by ISBN&lt;&#x2F;h2&gt;
&lt;p&gt;URL: &lt;code&gt;http:&#x2F;&#x2F;opac.sbn.it&#x2F;opacmobilegw&#x2F;search.json?isbn={ISBN}&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Example: &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;opac.sbn.it&#x2F;opacmobilegw&#x2F;search.json?isbn=9788842092995&quot;&gt;&#x2F;search.json?isbn=9788842092995&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
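&lt;p&gt;As a minimal sketch, the two search URLs described so far could be assembled like this in Python. The helper names &lt;code&gt;search_url&lt;&#x2F;code&gt; and &lt;code&gt;isbn_url&lt;&#x2F;code&gt; are hypothetical, the value &lt;code&gt;type=0&lt;&#x2F;code&gt; is simply copied from the captured request, and the endpoint is the one observed in the app traffic, so it may no longer respond.&lt;&#x2F;p&gt;

```python
from urllib.parse import urlencode

# Endpoint observed in the captured traffic; it may no longer be reachable.
BASE = "http://opac.sbn.it/opacmobilegw"


def search_url(text, start=0, rows=3):
    """Build a free-text search URL; start and rows page through the results."""
    # type=0 is copied from the captured request; its meaning is an unknown.
    query = urlencode({"any": text, "type": 0, "start": start, "rows": rows})
    return BASE + "/search.json?" + query


def isbn_url(isbn):
    """Build an ISBN lookup URL."""
    query = urlencode({"isbn": isbn})
    return BASE + "/search.json?" + query


if __name__ == "__main__":
    # Second page of three results for a free-text query.
    print(search_url("alpinismo eroico", start=3))
    print(isbn_url("9788842092995"))
```

&lt;p&gt;These helpers only build strings and send nothing over the network; any HTTP client can then fetch the resulting URL.&lt;&#x2F;p&gt;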
&lt;h2 id=&quot;metadati-di-un-singolo-record-bid&quot;&gt;Metadata for a single record (BID)&lt;&#x2F;h2&gt;
&lt;p&gt;URL: &lt;code&gt;http:&#x2F;&#x2F;opac.sbn.it&#x2F;opacmobilegw&#x2F;full.json?bid={BID}&lt;&#x2F;code&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Calling it with a &lt;strong&gt;BID&lt;&#x2F;strong&gt; (from the previous response, &lt;code&gt;&quot;codiceIdentificativo&quot;: &quot;IT\\ICCU\\RAV\\2002745&quot;&lt;&#x2F;code&gt;; note that the backslashes are single, the doubling is only JSON escaping) returns a JSON record with the book&#x27;s metadata, including its locations (the libraries that hold it) complete with geographic coordinates (and therefore ready to be plotted on a map).&lt;&#x2F;p&gt;
&lt;p&gt;Example: &lt;a rel=&quot;external&quot; href=&quot;http:&#x2F;&#x2F;opac.sbn.it&#x2F;opacmobilegw&#x2F;full.json?bid=IT%5CICCU%5CRAV%5C2002745&quot;&gt;&#x2F;full.json?bid=IT\ICCU\RAV\2002745&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
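&lt;p&gt;Because a BID contains backslashes, it has to be percent-encoded before going into the query string. A minimal sketch in Python, where the helper name &lt;code&gt;full_record_url&lt;&#x2F;code&gt; is hypothetical and the endpoint is the one observed in the app traffic:&lt;&#x2F;p&gt;

```python
from urllib.parse import quote

# Endpoint observed in the captured traffic; it may no longer be reachable.
BASE = "http://opac.sbn.it/opacmobilegw"


def full_record_url(bid):
    """Build the full.json URL for a BID, encoding each backslash as %5C."""
    # safe="" makes quote() escape every reserved character, slashes included.
    encoded = quote(bid, safe="")
    return BASE + "/full.json?bid=" + encoded


if __name__ == "__main__":
    # A raw string keeps the single backslashes of the BID untouched.
    print(full_record_url(r"IT\ICCU\RAV\2002745"))
```

&lt;p&gt;For the BID above this yields the same &lt;code&gt;IT%5CICCU%5CRAV%5C2002745&lt;&#x2F;code&gt; form used in the example link.&lt;&#x2F;p&gt;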
&lt;p&gt;Response received:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color: #F8F8F2; background-color: #272822;&quot;&gt;&lt;code data-lang=&quot;json&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;{&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;autorePrincipale&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Comici, Emilio&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;citazioni&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;standard&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;mla&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;valore&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Comici, Emilio. Alpinismo eroico Milano Corriere Della Sera, 2014&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    },&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;standard&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;apa&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;valore&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Comici, E. (2014). Alpinismo eroico Milano Corriere Della Sera.&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    }&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  ],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;codiceIdentificativo&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;IT&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;\\&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;ICCU&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;\\&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;RAV&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt;\\&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;2002745&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;collezione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Biblioteca della montagna ; 8&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;descrizioneFisica&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;170 p. ; 19 cm&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;linguaPubblicazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;ITALIANO&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;livello&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Monografia&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;localizzazioni&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;comune&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Canale d&amp;#39;Agordo&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;denominazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;BIBLIOTECA COMUNALE DI CANALE D&amp;#39;AGORDO&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;isil&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;IT-BL0089&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;latitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 46.3606418&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;longitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 11.9148422&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;provincia&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;BL&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;sbn&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;VIACQ&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    },&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;comune&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Associazione Italiana Cultura Sport&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;denominazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Biblioteca del Centro Informazione Documentazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;isil&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;IT-BO0630&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;latitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 44.4769143&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;longitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 11.4094361&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;provincia&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;CID-AICS&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;sbn&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;UBOXA&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    },&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;comune&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Firenze&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;denominazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Biblioteca delle Oblate&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;isil&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;IT-FI0104&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;latitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 43.772209&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;longitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 11.2600206&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;provincia&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;FI&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;sbn&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;RT1AA&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    },&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;comune&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Latina&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;denominazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Biblioteca comunale Aldo Manuzio&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;isil&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;IT-LT0048&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;latitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 41.4675967&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;longitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 12.9037&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;provincia&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;LT&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;sbn&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;RMSA2&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    },&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;comune&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Milano&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;denominazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Biblioteca nazionale Braidense&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;isil&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;IT-MI0185&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;latitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 45.471946&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;longitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 9.187845&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;provincia&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;MI&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;sbn&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;MILNB&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    },&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;comune&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Rimini&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;denominazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Biblioteca civica Gambalunga&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;isil&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;IT-RN0013&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;latitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 44.0616558&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;longitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 12.5678351&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;provincia&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;RN&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;sbn&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;RAVRI&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    },&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;comune&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Pecetto Torinese&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;denominazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Biblioteca civica&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;isil&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;IT-TO0152&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;latitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 45.0170177&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;longitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 7.7491581&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;provincia&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;TO&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;sbn&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;TO13T&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    },&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;comune&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Trieste&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;denominazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Biblioteca comunale Stelio Mattioni&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;isil&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;IT-TS0268&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;latitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 45.6164974&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;longitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 13.8230741&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;provincia&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;TS&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;sbn&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;TSAU2&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    },&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;comune&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Iesolo&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;denominazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;BIBLIOTECA CIVICA DI JESOLO&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;isil&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;IT-VE0124&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;latitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 45.5367875&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;longitudine&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #AE81FF;&quot;&gt; 12.6391389&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;provincia&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;VE&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;      &amp;quot;sbn&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;VIAVJ&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;    }&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;  ],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;luogoNormalizzato&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;nomi&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;&amp;quot;Comici, Emilio&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;note&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt;&amp;quot;Edizione speciale per Corriere della Sera.&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;numeri&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;: [],&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;paesePubblicazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;ITALIA&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;pubblicazione&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Milano : Corriere Della Sera, 2014&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;tipo&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Testo a stampa&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: #66D9EF;font-style: italic;&quot;&gt;  &amp;quot;titolo&amp;quot;&lt;&#x2F;span&gt;&lt;span&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: #CFCFC2;&quot;&gt; &amp;quot;Alpinismo eroico &#x2F; Emilio Comici&amp;quot;&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h1 id=&quot;disclaimer&quot;&gt;Disclaimer&lt;&#x2F;h1&gt;
&lt;p&gt;These APIs are not publicly documented, so they may change. Their terms of use are not known either, so I am not sure they can be freely used to build external applications on top of them.
If you want to experiment with them anyway, avoid aggressive scraping, and email &lt;a href=&quot;mailto:opac.contatti@iccu.sbn.it&quot;&gt;ICCU&lt;&#x2F;a&gt; to say that you would like to see this kind of service made public, documented, and released under suitable open licenses.&lt;&#x2F;p&gt;
</content>
    </entry> </feed>