03 Oct 2009

Executable Shared Libraries

I figured out a way to create a shared library that is also executable. One possible use for this would be to make Cython modules that can be run by themselves or imported from Python. I found some mailing list discussions from 2003 describing a method, but it does not work on x86 Linux. The method below works, at least, on x86-64 Linux with gcc 4.3.3 on Ubuntu 9.04.

Unfortunately, I could not get gcc to do this for me - I have to run the linker myself. I tried giving -Wl,-shared, but gcc put the flag too late in the command line. If anyone figures out a way to do this, let me know!

Update: Thanks to Daniel Jacobowitz, I found out that -fPIC -fPIE -pie is sufficient. Easy! I’ve updated the post below to reflect this.

Here’s some example code:

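// library.c
#include <stdio.h>
#include <stdlib.h>

// Exported function; callable when the file is loaded as a shared library.
void foo(int x) {
    printf("foo(%d)\n", x);
}

// Entry point used when the file is run as an executable.
int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "USAGE: %s n\n", argv[0]);
        return 1;
    }
    foo(atoi(argv[1]));
    return 0;
}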

Compile with -fPIC -fPIE -pie to make a position-independent executable, which can be loaded like a shared library.

$ gcc -fPIC -fPIE -pie -o library.c

Now you can both execute and load your shared library.

$ ./ 12

$ python -c \
    'import ctypes; x = ctypes.cdll.LoadLibrary("./");'
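
Loading the library is only half the story; the point is to call foo from Python. Here is a minimal sketch of how that might look with ctypes. It is written for Python 3, and the helper name call_int_function and the placeholder library path in the comment are assumptions for illustration, not part of the original setup.

```python
import ctypes


def call_int_function(library_path, symbol, value, restype=None):
    """Load a shared object and call a function that takes one C int.

    restype=None tells ctypes the function returns void, like foo() above.
    """
    lib = ctypes.CDLL(library_path)
    func = getattr(lib, symbol)
    func.argtypes = [ctypes.c_int]  # matches the signature foo(int x)
    func.restype = restype
    return func(value)


# Hypothetical usage; substitute the path of the file you built:
# call_int_function("./<your library>", "foo", 12)
```

Declaring argtypes is optional for a single int argument, but it makes ctypes reject a wrong argument type instead of passing garbage to the C function.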

Manual Method

The rest of this post describes the stupid way I did it before, mainly for historical interest.

Run gcc '-###' to find out what command it would have run.

$ gcc '-###' -fPIC -fPIE -pie -rdynamic -o library.o

Look for the collect2 line - this is the linker. Run that exact command, but insert -shared before the -pie. On my system, I ran this:

$ /usr/lib/gcc/x86_64-linux-gnu/4.3.3/collect2 \
    --eh-frame-hdr \
    -m elf_x86_64 \
    -shared \
    -pie \
    --export-dynamic \
    -o \
    -z relro \
    /usr/lib/Scrt1.o \
    /usr/lib/crti.o \
    /usr/lib/gcc/x86_64-linux-gnu/4.3.3/crtbeginS.o \
    -L/usr/lib/gcc/x86_64-linux-gnu/4.3.3 \
    -L/usr/lib \
    -L/lib \
    library.o \
    -lgcc \
    --as-needed \
    -lgcc_s \
    --no-as-needed \
    -lc \
    -lgcc \
    --as-needed \
    -lgcc_s \
    --no-as-needed \
    /usr/lib/gcc/x86_64-linux-gnu/4.3.3/crtendS.o \
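
The manual edit described above (find the collect2 line, insert -shared before -pie) could also be scripted. The following is only a sketch, assuming the text printed by gcc '-###' has been captured into a string; the function name add_shared_before_pie is hypothetical.

```python
import shlex


def add_shared_before_pie(gcc_dry_run_output):
    """Return the collect2 argument list with -shared inserted before -pie."""
    for line in gcc_dry_run_output.splitlines():
        try:
            args = shlex.split(line)  # gcc quotes each argument
        except ValueError:
            continue  # skip lines with unbalanced quotes
        if args and args[0].endswith("collect2") and "-pie" in args:
            i = args.index("-pie")
            return args[:i] + ["-shared"] + args[i:]
    raise ValueError("no collect2 line containing -pie found")
```

The returned list can be handed to subprocess.call, or joined back into a shell command for inspection before running it.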

30 Aug 2009

NumPy, SciPy, and the Intel Compiler Suite

I spent a lot of time trying to get NumPy and SciPy to work with the Intel Compiler Suite, but it finally works. Here is how I did it on Ubuntu 9.04/AMD64 on an Intel Core 2 Duo using the versions specified. It appears that every combination of versions requires a different process, so hopefully this works for you!

Intel Compiler Suite 11.1.046

Download both the Intel C++ and Fortran compilers for Linux, which are free for non-commercial use. I used Intel 64 version 11.1.046. The install is very easy; I installed the C++ compiler and then the Fortran one.

Once they are installed, source the script.

source /opt/intel/Compiler/11.1/046/bin/ intel64

Also, you may want to do the following so that you do not need to have the Intel library directories in your $LD_LIBRARY_PATH.

cat > /etc/ <<EOF
sudo /sbin/ldconfig

Note About -xHost

In recent Intel compilers, you can give the -xHost option, which will optimize based on the Intel processor of the current host. This works great if you are compiling with an Intel processor and you will run on the same machine. This was my case, so this is the flag I used. If this is not the case (AMD processor, or compiling to run on other chips), then you will want to set this optimization flag to something more appropriate.
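
If you are unsure whether the build machine has an Intel processor, the vendor string in /proc/cpuinfo is a quick way to check on Linux. This is a rough sketch with hypothetical helper names; it only looks at the build host and says nothing about the machines the binaries will later run on.

```python
def cpu_vendor(cpuinfo_text):
    """Return the first vendor_id value from /proc/cpuinfo-style text, or None."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "vendor_id":
            return value.strip()
    return None


def xhost_is_reasonable(cpuinfo_text):
    """True when the host reports an Intel CPU, the case where -xHost applies."""
    return cpu_vendor(cpuinfo_text) == "GenuineIntel"


# Hypothetical usage on Linux:
# with open("/proc/cpuinfo") as f:
#     print(xhost_is_reasonable(f.read()))
```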

AMD / UMFPACK, part of SuiteSparse 3.4.0

The AMD and UMFPACK libraries (part of SuiteSparse) are needed for SciPy, so we’ll build them first. They are static libraries, so I chose to build them in /opt.

cd /opt
tar zxvf SuiteSparse-3.4.0.tar.gz
cd SuiteSparse

First we need to patch the Makefile to build with the Intel compiler. If you want to compile the whole suite, you also need to set up BLAS and LAPACK to use MKL. But we don’t need the whole suite for SciPy.

patch -p0 <<EOF
diff -ur SuiteSparse.orig/UFconfig/ SuiteSparse/UFconfig/
--- SuiteSparse.orig/UFconfig/	2009-05-20 14:06:04.000000000 -0400
+++ SuiteSparse/UFconfig/	2009-08-07 02:03:19.000000000 -0400
@@ -33,11 +33,11 @@
 # C compiler and compiler flags:  These will normally not give you optimal
 # performance.  You should select the optimization parameters that are best
 # for your system.  On Linux, use "CFLAGS = -O3 -fexceptions" for example.
-CC = cc
+CC = icc
 # CFLAGS = -O   (for example; see below for details)

 # C++ compiler (also uses CFLAGS)

 # ranlib, and ar, for generating libraries
 RANLIB = ranlib
@@ -48,8 +48,8 @@
 MV = mv -f

 # Fortran compiler (not normally required)
-F77 = f77
-F77FLAGS = -O
+F77 = ifort
+F77FLAGS = -O3 -xHost
 F77LIB =

 # C and Fortran libraries
@@ -220,7 +224,7 @@

 # Using default compilers:
 # CC = gcc
-CFLAGS = -O3 -fexceptions
+CFLAGS = -O3 -xHost -fPIC -openmp -vec_report=0

 # alternatives:
 # CFLAGS = -g -fexceptions \

Compile both libraries. No install needed after this.

make -C AMD
make -C UMFPACK

Numpy 1.3.0

Now on to NumPy.

tar zxvf numpy-1.3.0.tar.gz
cd numpy-1.3.0

Next, set up the site.cfg file for Intel MKL, AMD, and UMFPACK. (Only MKL is needed for NumPy, but if we set this up now, it will work for SciPy later.)

Warning: I used mkl_mc because I have a Core architecture. Read the MKL User’s Guide to determine which kernel library to use for your architecture. It should be automatic (from mkl_core), but for some reason it kept failing with undefined symbol: mkl_dft_commit_descriptor_s_c2c_md_omp. Adding mkl_mc fixed it.

cat > site.cfg <<EOF
include_dirs = /opt/SuiteSparse/UFconfig

amd_libs = amd
library_dirs = /opt/SuiteSparse/AMD/Lib
include_dirs = /opt/SuiteSparse/AMD/Include

umfpack_libs = umfpack
library_dirs = /opt/SuiteSparse/UMFPACK/Lib
include_dirs = /opt/SuiteSparse/UMFPACK/Include

include_dirs = /opt/intel/Compiler/11.1/046/mkl/include
library_dirs = /opt/intel/Compiler/11.1/046/mkl/lib/em64t
lapack_libs = mkl_lapack
mkl_libs = mkl_intel_lp64, mkl_intel_thread, mkl_core, mkl_mc

Now, turn on optimization, -fPIC, and OpenMP support for icc.

patch -p0 <<EOF
--- numpy/distutils/       2009-03-29 07:24:21.000000000 -0400
+++ numpy/distutils/   2009-08-05 23:58:30.000000000 -0400
@@ -8,7 +8,7 @@

     compiler_type = 'intel'
-    cc_exe = 'icc'
+    cc_exe = 'icc -xHost -O3 -fPIC -openmp'

     def __init__ (self, verbose=0, dry_run=0, force=0):
         UnixCCompiler.__init__ (self, verbose,dry_run, force)

Set the appropriate flags for ifort, skipping Numpy’s broken autodetection.

patch -p0 <<EOF
--- numpy/distutils/fcompiler/      2009-03-29 07:24:21.000000000 -0400
+++ numpy/distutils/fcompiler/  2009-08-06 23:08:59.000000000 -0400
@@ -47,6 +47,7 @@
     module_include_switch = '-I'

     def get_flags(self):
+        return ['-fPIC', '-cm']
         v = self.get_version()
         if v >= '10.0':
             # Use -fPIC instead of -KPIC.
@@ -63,6 +64,7 @@
         return ['-O3','-unroll']

     def get_flags_arch(self):
+        return ['-xHost']
         v = self.get_version()
         opt = []
         if cpu.has_fdiv_bug():

Build using icc. You have to add the build_src to fix a bug in the Numpy 1.3.0 distribution.

python build_src config --compiler=intel build_clib \
    --compiler=intel build_ext --compiler=intel

If you want to install and test without installing to the system:

python install --prefix=$PWD/d
(export PYTHONPATH=$PWD/d/lib/python2.6/site-packages
 cd $HOME && python -c 'import numpy; numpy.test()')

Install as root.

sudo python install
sudo cp site.cfg /usr/local/lib/python2.6/dist-packages/numpy

Test (must do this outside the numpy source directory).

(cd $HOME && python -c 'import numpy; numpy.test()')

SciPy 0.7.1

Now, on to SciPy.

tar zxvf scipy-0.7.1.tar.gz
cd scipy-0.7.1

Fix compilation with icc.

patch -p0 <<EOF
--- scipy/special/cephes/const.c.bak    2009-08-07 01:56:43.000000000 -0400
+++ scipy/special/cephes/const.c        2009-08-07 01:57:08.000000000 -0400
@@ -91,12 +91,12 @@
 double THPIO4 =  2.35619449019234492885;       /* 3*pi/4 */
 double TWOOPI =  6.36619772367581343075535E-1; /* 2/pi */
-double INFINITY = 1.0/0.0;  /* 99e999; */
+double INFINITY = __builtin_inff();
 double INFINITY =  1.79769313486231570815E308;    /* 2**1024*(1-MACHEP) */
 #ifdef NANS
-double NAN = 1.0/0.0 - 1.0/0.0;
+double NAN = __builtin_nanf("");
 double NAN = 0.0;

Build. Note that you have to set fcompiler=intelem for Intel 64.

python config --compiler=intel --fcompiler=intelem build_clib \
    --compiler=intel --fcompiler=intelem build_ext --compiler=intel \
    --fcompiler=intelem -I/opt/SuiteSparse/UFconfig

The SWIG code generates C++, but distutils doesn’t use the C++ compiler to link, so we have to do this ourselves. There may be a way to fix this in the build scripts, but it’s easier to just do it by hand.

for x in csr csc coo bsr dia; do
    icpc -xHost -O3 -fPIC -shared \
        build/temp.linux-x86_64-2.6/scipy/sparse/sparsetools/${x}_wrap.o \
        -o build/lib.linux-x86_64-2.6/scipy/sparse/sparsetools/_${x}.so
done
icpc -xHost -O3 -fPIC -openmp -shared \
    build/temp.linux-x86_64-2.6/scipy/interpolate/src/_interpolate.o \
    -o build/lib.linux-x86_64-2.6/scipy/interpolate/

As with numpy, you may want to install to $PWD/d and test first.

python install --prefix=$PWD/d
(export PYTHONPATH=$PWD/d/lib/python2.6/site-packages
 cd $HOME && python -c 'import scipy; scipy.test()')

Install as root.

sudo python install

Test (again, outside the source directory).

(cd $HOME && python -c 'import scipy; scipy.test()')
