Validating Subject Terms Against the AGROVOC REST API

AGROVOC is a controlled vocabulary covering all areas of interest of the Food and Agriculture Organization (FAO) of the United Nations, including food, nutrition, agriculture, fisheries, forestry, and environment. It is published by FAO and edited by a community of experts¹. At the time of this writing AGROVOC consists of over 36,000 concepts and is […]

Migrate GlusterFS to a New Operating System

During an infrastructure upgrade earlier this year I discovered that the in-place upgrade path between CentOS 6 and CentOS 7 is nonexistent — or at least not advised. Suddenly I found myself stuck with the unexpected task of re-provisioning a cluster of GlusterFS storage servers from scratch. To my relief the process was straightforward and, due to […]

Cache Maven Artifacts With Artifactory

Anyone who has worked with a Java-based project has noticed the tendency of build systems like Maven and Gradle to seemingly “download the Internet” during compilation. The effect is magnified if your workflow uses containers because build artifacts are, by definition, removed after the build process completes. Developers get tired of this waste of time […]

Rate Limiting Baiduspider Using nginx

The Baidu search engine has a voracious appetite for content and crawls one of my sites aggressively. It’s bad enough having to deal with load generated by bots from large technology companies with vast resources, but it’s another thing entirely when those bots crawl from dozens of IP addresses simultaneously and routinely browse thousands of […]