Wednesday, November 13, 2013

How to batch change image dpi resolution with imagemagick convert and bash

It is often useful to manipulate a lot (a lot!!!) of images the batch way. For today's homework we are going to use ImageMagick's convert, a wonderful little tool, along with bash. In our case we are going to take a bunch of *.png images and change their resolution to 600 dpi. For this we are going to use a bash loop plus convert with the -units PixelsPerInch and -density 600 options. The command is as simple as:

for f in path/from/*.png; do
    convert -units PixelsPerInch "$f" -density 600 "path/to/$(basename "$f")"
done

Of course the same command line can be used to do much more than changing the resolution by feeding different options to the convert command line utility.
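Before unleashing the loop on hundreds of real images, a dry run with echo is a cheap way to check that each input file gets mapped to the output folder you expect. This is just a sketch: the folder names and the two touch-ed files are placeholders, and the convert commands are only printed, not executed.

```shell
# Set up a throwaway example tree (placeholder names)
mkdir -p path/from path/to
touch path/from/a.png path/from/b.png

# Dry run: print the convert command that would run for each file
for f in path/from/*.png; do
    echo convert -units PixelsPerInch "$f" -density 600 "path/to/$(basename "$f")"
done
```

Once the printed commands look right, drop the echo and the loop does the real work.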

Fast Import ASCII Files in Python (and Numpy)

In my line of research (computational electrodynamics), I often have to manipulate quite large ASCII files, in the range of 10-100 MB in size. These files usually contain mappings of electromagnetic fields, and all the information is simply stored column-wise in ASCII format. I know very well that there are smarter ways to store this kind of information (HDF5, to mention one of many), but sometimes, either out of laziness or to avoid linking one more library, it is simpler to save everything to an ASCII file and gzip it to save some room on the hard drive.

When the time for data analysis and plotting comes, I love to use Numpy and Matplotlib, and loadtxt() is an excellent tool for efficiently loading small ASCII files. Nevertheless, reading files several million lines long with it becomes cumbersome, and a different approach is needed to solve this speed (and size) issue. Long story short: say we have a bunch of ASCII *.gz files in a given folder, each n_col columns wide and several million lines long. The following snippet imports them all: it unzips each file, reads and reshapes the data, and stores everything in a Python list of numpy arrays. First of all we need to import some libraries:

import glob
import gzip

import numpy as np

then for the useful part:

data_files = glob.glob("path/to/folder/*.gz")    # file names in the folder
data_list = []                                   # list for imported data storage
for fname in data_files:
    with gzip.open(fname, "rt") as gunzipped_file:     # unzip, text mode
        data = np.array(gunzipped_file.read().split(),
                        dtype=np.float32)              # read data
    data = data.reshape(data.size // n_col, n_col)     # reshape data
    data_list.append(data)                             # store in the list

This is it, and it is fast.
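To see the whole round trip in miniature, here is a self-contained sketch: it writes a tiny 3-column ASCII table to a gzipped file in a temporary folder (file name and n_col value are just examples), then reads it back with the same split-and-reshape trick used above.

```python
import gzip
import os
import tempfile

import numpy as np

n_col = 3  # number of columns in the ASCII table (example value)

# Write a small gzipped ASCII file, two rows of three floats
path = os.path.join(tempfile.mkdtemp(), "demo.gz")
with gzip.open(path, "wt") as fh:
    fh.write("1.0 2.0 3.0\n4.0 5.0 6.0\n")

# Read it back: one flat split over the whole file, then reshape
with gzip.open(path, "rt") as fh:
    data = np.array(fh.read().split(), dtype=np.float32)
data = data.reshape(data.size // n_col, n_col)

print(data.shape)  # (2, 3)
```

The key point is that the file is tokenized in one pass instead of line by line, which is where the speedup over loadtxt() on huge files comes from.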