Beyond HTTP APIs: the case for database dumps in Cultural Heritage

In the realm of cultural heritage, we're not just developing websites; we're creating data platforms. One of the primary missions of cultural institutions is to make data (both metadata and digital content) freely available on the web. This data should come with appropriate usage licenses and in suitable formats to facilitate interoperability and content sharing.

Read more...

Building a simple IIIF digital library with Tropy, Tropiiify and Canopy

Creating and maintaining an online digital collection can be a complex process involving multiple components, from organizational procedures to software solutions. With many moving parts, it's no surprise that building and curating a digital collection can be costly, time-consuming, and demanding to maintain. When dealing with cultural heritage, maintenance and long-term preservation should be our primary concerns. The approach we should always consider is minimal computing.

Read more...

Revamping IIIF.link

A few years ago, I developed a small application that let you "frame" a specific part of a IIIF image and share it on the web through simple, concise URLs. The initial version was rudimentary and only supported IIIF 2, so I've since revamped it using the latest release of the TIFY viewer.
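For readers unfamiliar with how a region of a IIIF image is addressed, here is a minimal sketch of the IIIF Image API URL pattern (identifier, region, size, rotation, quality, format). It is not the code behind IIIF.link; the function name, the example server and the identifier are hypothetical. The default size `max` is an assumption that suits recent servers, while older IIIF 2 servers may expect `full`.

```python
from urllib.parse import quote


def iiif_region_url(base_url, identifier, x, y, w, h,
                    size="max", rotation=0, quality="default", fmt="jpg"):
    """Build a IIIF Image API URL for a rectangular region given in pixels."""
    # Region syntax is "x,y,w,h" measured from the top-left corner of the image.
    region = f"{x},{y},{w},{h}"
    # The identifier is percent-encoded, including any slashes it contains.
    ident = quote(identifier, safe="")
    return f"{base_url.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier, for illustration only.
print(iiif_region_url("https://example.org/iiif", "book1/page-3", 100, 200, 640, 480))
```

A sharing service of this kind only has to store or encode the four region numbers next to the image identifier, which is why the resulting URLs can stay short.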

Read more...

ArchivIIIFy

A short guide to downloading digitized books from the Internet Archive and rehosting them on your own infrastructure using IIIF, with full-text search.

Read more...

Pywb 2.0 - docker quickstart

Four years have passed since I first wrote about pywb: it was a young tool at the time, but already usable and extremely simple to deploy. Since then a lot of work has been done by Ilya Kreymer (and others), resulting in all the new features available with the 2.0 release.

Read more...

Anonymous webarchiving

Webarchiving, like any other activity that involves an HTTP client, leaves traces: the web server you are visiting or crawling will save your IP address in its logs (or, even worse, it can decide to ban your IP). This is usually not a problem; there are plenty of good reasons for a web server to keep logs of its visitors.
But sometimes you may need to protect your identity when you are visiting or saving something from a website, and there are many people in sensitive roles who need this protection: activists, journalists, political dissidents.
Tor was invented for this, and today it offers good protection for browsing the web anonymously.
Can we also archive the web through Tor?

Read more...

Open BNI

On 30 May 2016 the open release of the Bibliografia Nazionale Italiana (BNI) was announced. The opening of this catalogue was welcomed (even with the limitation of PDF-only files), and as a layman in library science I also asked a question about the actual use case of the BNI.
On 30 August 2016 it was announced that the 2015 and 2016 volumes had also been released in UNIMARC and MARCXML formats.
Intrigued by the catalogue, I started exploring it, thinking about possible transformations (RDF triples) or enrichments with or towards other data (Wikidata).

Read more...

Epub linkrot

Linkrot also affects epub files (who would have thought! :)).
How to check the health of external links in epub books (required tools: a shell, atool, pup, GNU parallel).
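The post itself relies on shell tools. As a rough alternative sketch, the same idea can be expressed with the Python standard library only: an epub is a zip archive, so its (X)HTML files can be read directly, the external `href` values collected, and each one probed with a HEAD request. The function names are hypothetical, and the file-extension filter is an assumption about how the book is packaged.

```python
import urllib.error
import urllib.request
import zipfile
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collect absolute http(s) links from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.links.add(value)


def external_links(epub_path):
    """Return the sorted external links found in the epub's (X)HTML files."""
    links = set()
    with zipfile.ZipFile(epub_path) as epub:
        for name in epub.namelist():
            # Assumption: content documents use one of these extensions.
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                collector = _LinkCollector()
                collector.feed(epub.read(name).decode("utf-8", errors="replace"))
                collector.close()
                links |= collector.links
    return sorted(links)


def check_link(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except OSError:  # URLError, timeouts and connection errors
        return None
```

Calling `check_link` on every item returned by `external_links("book.epub")` gives a status per link; some servers reject HEAD requests, so a failing status is a hint to recheck by hand rather than proof that the link is dead.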

Read more...

SKOS Nuovo Soggettario, API and autocomplete

How to build an API for an autocomplete form using the terms of the Nuovo Soggettario, with Redis Sorted Sets and Nginx+Lua.
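To illustrate why a sorted set suits autocompletion: when all members of a Redis sorted set share the same score they are ordered lexicographically, so every term starting with a given prefix sits in one contiguous range (the kind of range `ZRANGEBYLEX` returns). The sketch below mimics that lookup in plain Python with a sorted list and a binary search. It is not the Redis or Lua code from the post, and the sample terms are made up for illustration rather than taken from the thesaurus.

```python
from bisect import bisect_left


def autocomplete(sorted_terms, prefix, limit=10):
    """Return up to `limit` terms starting with `prefix` from a sorted list."""
    # Binary search for the first term that is >= prefix.
    start = bisect_left(sorted_terms, prefix)
    results = []
    for term in sorted_terms[start:start + limit]:
        # Matching terms are contiguous, so stop at the first mismatch.
        if not term.startswith(prefix):
            break
        results.append(term)
    return results


# Illustrative terms only.
terms = sorted(["archivi", "archivistica", "biblioteche", "bibliografia", "musei"])
print(autocomplete(terms, "bibl"))  # ['bibliografia', 'biblioteche']
```

The lookup cost depends on the logarithm of the vocabulary size plus the number of suggestions returned, which is what makes this structure practical for a form that queries on every keystroke.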

Read more...

Serve deepzoom images from a zip archive with openseadragon

vips is a fast image processing system. Versions higher than 7.40 can generate static tiles of big images in deepzoom format, saving them directly into a zip archive.
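As a minimal sketch of the serving side, a single tile can be read straight out of the zip archive without unpacking it. Deep Zoom tiles follow the layout `<name>_files/<level>/<column>_<row>.<format>`; the `prefix` default and the function name below are assumptions, and the exact paths inside an archive written by vips should be checked with a zip listing first.

```python
import zipfile


def read_tile(zip_path, level, column, row, prefix="image_files", ext="jpeg"):
    """Return the bytes of one Deep Zoom tile stored in a zip, or None if missing."""
    # Assumed member path; adjust `prefix` to match the archive's real layout.
    member = f"{prefix}/{level}/{column}_{row}.{ext}"
    with zipfile.ZipFile(zip_path) as archive:
        try:
            return archive.read(member)
        except KeyError:
            # ZipFile.read raises KeyError when the member does not exist.
            return None
```

A small web handler would map the tile URL requested by the viewer to `level`, `column` and `row`, call a function like this one, and return the bytes with an image content type, or a 404 when it gets `None`.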

Read more...

A wayback machine (pywb) on a cheap, shared host

For a long time the only free (I'm unaware of commercial ones) implementation of web archive replay software was the Wayback Machine (now OpenWayback). It's stable and mature software, with a strong community behind it.
To use it you need to be comfortable deploying a Java web application; this is not so difficult, and the documentation is exhaustive.
But there is a new player in the game: pywb, developed by Ilya Kreymer, a former Internet Archive developer.
It is built in Python, is simpler than Wayback, and is now used in a professional archiving project at Rhizome.

Read more...

Open data from the Anagrafe Biblioteche

How to use the open data of the Anagrafe delle Biblioteche Italiane and plot the libraries' addresses on a web map.
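One common way to get tabular records onto a web map is to convert them to GeoJSON, which most web mapping libraries can load directly. The sketch below assumes the data is available as CSV and already contains coordinates; the column names (`name`, `latitude`, `longitude`) and the sample row are hypothetical and do not reflect the actual structure of the Anagrafe dataset.

```python
import csv
import io
import json


def csv_to_geojson(csv_text, name_col="name", lat_col="latitude", lon_col="longitude"):
    """Convert CSV rows with coordinates into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (KeyError, TypeError, ValueError):
            # Skip rows with missing or malformed coordinates.
            continue
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": row.get(name_col, "")},
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up sample row, for illustration only.
sample = "name,latitude,longitude\nExample Library,41.9,12.5\n"
print(json.dumps(csv_to_geojson(sample), indent=2))
```

If the source data only has street addresses, a geocoding step would be needed before this conversion.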

Read more...

JSON API of the OPAC SBN

A few months ago ICCU released a mobile app for searching the OPAC SBN. Although its design is not very attractive, the app works well, and I find two features very useful: searching for a book by scanning its barcode with the phone camera, and bookmarking favourites.
Curious about how it works, I decided to analyze its HTTP traffic.

Read more...