Xyne's Forum

Ignore the dust.


#1 2013-01-08 23:47:56

fhr
Member
Registered: 2013-01-08
Posts: 8

Linear algebra

I propose to start a list of linear algebra-related software that could be packaged or updated. I mainly know sparse direct solvers, so the starting point of the list will be biased :)

Building blocks (dense linear algebra, used by bigger packages):

  • BLAS (basic dense linear algebra, e.g. matrix-vector product): this is needed by a lot of packages and is crucial for performance, but the situation is messy because there are several competing implementations. In the AUR we have ACML, ATLAS, GOTOBLAS and OPENBLAS. The official repos have a blas package, which is probably built from the reference implementation (so performance is very low).

  • LAPACK ("advanced" dense linear algebra, e.g. factorizations): there's a package in the official repos. There's also a package in the AUR that uses the ATLAS BLAS.

  • ScaLAPACK (parallel dense linear algebra): there's a package in the AUR that uses the ATLAS LAPACK/BLAS mentioned above. ScaLAPACK depends on MPI (more specifically, on one implementation of MPI), but the AUR package doesn't declare any MPI dependency, which is bad.

Ordering/graph partitioning (used by sparse linear algebra codes):

  • METIS/ParMETIS (sequential/parallel graph partitioner): there are packages in the AUR, including two parallel versions (one depends on OpenMPI from [extra], the other on MPICH2). They look up to date.

  • Scotch/PT-Scotch (sequential/parallel graph partitioner): there are packages in the AUR; the parallel one depends on OpenMPI. The sequential package is outdated.

Sparse linear algebra:

  • MUMPS (parallel sparse direct solver): there's a package in the AUR, built against ParMETIS and PT-Scotch from the AUR and against mpich2. It's outdated.

  • SuperLU (parallel sparse direct solver): comes in three flavors (SuperLU, sequential; SuperLU_MT, shared memory; SuperLU_DIST, distributed memory). There are packages in the AUR for SuperLU and SuperLU_MT.

  • Spooles (sparse direct solver): there's a package in AUR.

  • UMFPACK (sparse direct solver): apparently there was a package at some point. UMFPACK is accessible through SuiteSparse (see below) but it could make sense to have a standalone package.

  • ARPACK (eigensolver): there's a package in AUR. It depends on openmpi.

"Meta packages" (libraries that provide interfaces to a lot of stuff):

  • PETSC: there's a package on AUR but some dependencies are broken.

  • Trilinos: there's a package on AUR.

  • SuiteSparse (sparse matrix tools, ordering algorithms, direct solvers): there's a package in [extra], but it doesn't depend on METIS, which is not great.


For the building blocks, I guess all we need to do is check that the current PKGBUILDs are fine. In particular, we need to check that the "provides" fields are OK. I have a question, though: say a user tries to install lapack via yaourt or another AUR frontend that checks dependencies; will the default BLAS package from [extra] be selected? If so, that's bad performance-wise.
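
The "provides" mechanism mentioned above can be sketched as a PKGBUILD fragment (field names per Arch packaging; the values are illustrative, not taken from an actual AUR PKGBUILD):

```shell
# Hypothetical fragment: an optimized BLAS package declaring that it
# satisfies dependencies on the stock blas package from [extra].
pkgname=openblas
provides=('blas')    # dependency resolution treats this package as blas
conflicts=('blas')   # both ship the same library names, so they can't coexist
```

With such a field in place, a frontend resolving a dependency on blas could accept the already-installed optimized package instead of pulling in the reference implementation.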

For the sparse linear algebra packages, I see several problems. Take the example of MUMPS: it can be built in parallel mode or in sequential mode (by linking against a dummy MPI provided in the package), and it can optionally be hooked to Scotch/PT-Scotch and/or METIS/ParMETIS. That's a lot of combinations; what's the spirit here? Forcing the user to install the whole thing (the parallel version with all the packages mentioned above), or also providing a minimal version (sequential, almost standalone)?

There's the same issue with what I call meta-packages; they can be built against a few things or against zillions of them...

Offline

#2 2013-01-09 00:48:37

gborzi
Member
Registered: 2013-01-05
Posts: 5

Re: Linear algebra

Hi fhr,
I'll contribute as well to the Linear algebra discussion.
Building blocks: actually, gotoblas is no longer in development; openblas is its successor. The current package is multi-threaded with OpenMP. It is possible to build a single-threaded openblas, or a multi-threaded one without OpenMP (and incompatible with it), and it is even possible to compile lapack along with it, like atlas-lapack does. One thing we should discuss, IMHO, is which version(s) are of interest. In my limited experience, which "fast BLAS" package works best is machine-dependent, although openblas is one of the best.
Sparse linear algebra: SuperLU_MT seems to be no longer in development. I've not packaged SuperLU_DIST because I don't have access to a distributed-memory machine, so I cannot test it. Spooles includes an MPI version, which is not compiled in my package because I can't test it. Spooles development stopped in 1999, but it works very well and doesn't require lots of dependencies. For UMFPACK we can make a "suitesparse-dyn" package that offers dynamic libraries instead of the static ones of the package in [extra], and links with metis. ARPACK is already in [community], and its C++ interface (arpack++) is already an AUR package. The only thing currently not working in arpack++ is the umfpack interface.
Re. your question about installation, I think it depends on the AUR frontend. The frontend I use, aurget, doesn't install non-AUR packages.
Re. "Parallel or serial", I think that for sparse solvers at a minimum the shared-memory version should be available. But, ideally, I would include the MPI version if it can be tested.

Offline

#3 2013-01-09 07:01:54

fhr
Member
Registered: 2013-01-08
Posts: 8

Re: Linear algebra

gborzi wrote:

Hi fhr,
Building blocks: actually, gotoblas is no longer in development; openblas is its successor. The current package is multi-threaded with OpenMP. It is possible to build a single-threaded openblas, or a multi-threaded one without OpenMP (and incompatible with it), and it is even possible to compile lapack along with it, like atlas-lapack does. One thing we should discuss, IMHO, is which version(s) are of interest.

Yes, openblas sounds good. What I would like is a multithreaded version that defaults to 1 thread when no specific variable (some *_NUM_THREADS) is set, unlike these libs that automatically eat up all your cores.

gborzi wrote:

In my limited experience, which "fast BLAS" package works best is machine-dependent, although openblas is one of the best.

That's where ABS is nice, since openblas applies a lot of hardware-dependent tricks at compile time. Providing binaries of openBLAS (or worse, of ATLAS, which relies on automated tuning) doesn't make much sense to me.

gborzi wrote:

Sparse linear algebra: SuperLU_MT seems to be no longer in development. I've not packaged SuperLU_DIST because I don't have access to a distributed-memory machine, so I cannot test it.

An interesting point is that the three flavors of SuperLU rely on different algorithms, so even on one thread/process they won't behave the same; providing three different packages would therefore make sense. For MUMPS, I would vote for a parallel build (i.e., depending on a real MPI), since installing MPI is not a big deal (it depends on almost nothing and takes maybe 20 MB once installed).

gborzi wrote:

Re. your question about installation, I think it depends on the AUR frontend. The frontend I use, aurget, doesn't install non-AUR packages.

So, could we maybe change the dependencies so that the packages rely on, let's say, "fastblas" (that would be provided by openBLAS or others)?

PS: I think I saw you at a conference (but we don't know each other); at least, I saw a talk by your namesake (A.) :)

Offline

#4 2013-01-09 12:56:26

gborzi
Member
Registered: 2013-01-05
Posts: 5

Re: Linear algebra

fhr wrote:

Yes, openblas sounds good. What I would like is a multithreaded version that defaults to 1 thread when no specific variable (some *_NUM_THREADS) is set, unlike these libs that automatically eat up all your cores.

I think it is not possible. When compiled with OpenMP the only available environment variable to control the number of threads is OMP_NUM_THREADS, which would determine the number of threads of the whole program. Hence the need for a single-threaded openblas.
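
The distinction can be sketched as launch commands (my_solver is a hypothetical binary; OPENBLAS_NUM_THREADS is the variable a non-OpenMP threaded OpenBLAS honors):

```shell
# OpenMP build: one knob for everything, application and BLAS alike.
OMP_NUM_THREADS=4 ./my_solver        # hypothetical binary

# pthreads (non-OpenMP) build: BLAS threading is controlled separately,
# so the application can use OpenMP while BLAS stays single-threaded.
OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=8 ./my_solver
```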

fhr wrote:

An interesting point is that the three flavors of SuperLU rely on different algorithms, so even on one thread/process they won't behave the same; providing three different packages would therefore make sense. For MUMPS, I would vote for a parallel build (i.e., depending on a real MPI), since installing MPI is not a big deal (it depends on almost nothing and takes maybe 20 MB once installed).

I agree with you on this. Does MUMPS have different calls for single-thread, multi-thread for shared memory and multi-thread for distributed memory, like spooles?

fhr wrote:

So, could we maybe change the dependencies so that the packages rely on, let's say, "fastblas" (that would be provided by openBLAS or others)?

There is a problem when compiling a package that requires blas while a fastblas is installed. Let's say you're compiling gmsh with openblas installed: the binary will link with libopenblas.so.0. Then, if you decide to install another blas library, like atlas-lapack or the stock blas, gmsh needs to be recompiled (or rather, re-linked) because the binary points to a library that no longer exists.

fhr wrote:

PS: I think I saw you at a conf (but we don't know each other); at least, I saw a talk by your homonymous (A.) smile

I think you saw my brother, Alfio. He works mainly on MultiGrid, but isn't a Linux user.

Offline

#5 2013-01-10 22:44:22

fhr
Member
Registered: 2013-01-08
Posts: 8

Re: Linear algebra

gborzi wrote:

I think it is not possible. When compiled with OpenMP the only available environment variable to control the number of threads is OMP_NUM_THREADS, which would determine the number of threads of the whole program. Hence the need for a single-threaded openblas.

Yes, otherwise that would be a problem for people who have OpenMP in their code but don't want multithreaded BLAS, I hadn't thought of that.

gborzi wrote:

I agree with you on this. Does MUMPS have different calls for single-thread, multi-thread for shared memory and multi-thread for distributed memory, like spooles?

MUMPS has only one code path, written in MPI. There's no multithreading in the current release, so the only way to get multithreading is a multithreaded BLAS. Usually we advise using as much BLAS multithreading as possible (e.g., with 8 cores, 2 MPI processes with a 4-way threaded BLAS is often better than 8 MPI processes with a single-threaded BLAS).
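
The 8-core example can be sketched as a launch line (mumps_driver is a made-up binary name; the -x flag for exporting environment variables to the ranks is OpenMPI's):

```shell
# 2 MPI ranks, each using a 4-way threaded BLAS, on one 8-core node:
# often faster than 8 single-threaded ranks for the same factorization.
mpirun -np 2 -x OMP_NUM_THREADS=4 ./mumps_driver
```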

gborzi wrote:

There is a problem when compiling a package that requires blas while a fastblas is installed. Let's say you're compiling gmsh with openblas installed: the binary will link with libopenblas.so.0. Then, if you decide to install another blas library, like atlas-lapack or the stock blas, gmsh needs to be recompiled (or rather, re-linked) because the binary points to a library that no longer exists.

I don't get this. gmsh points to libblas, which is a link to whatever implementation of BLAS (openblas, mkl, etc.) is installed, isn't it?

Offline

#6 2013-01-11 16:27:09

gborzi
Member
Registered: 2013-01-05
Posts: 5

Re: Linear algebra

fhr wrote:

MUMPS has only one code path, written in MPI. There's no multithreading in the current release, so the only way to get multithreading is a multithreaded BLAS. Usually we advise using as much BLAS multithreading as possible (e.g., with 8 cores, 2 MPI processes with a 4-way threaded BLAS is often better than 8 MPI processes with a single-threaded BLAS).

Thanks for the info, I'll try MUMPS in the future.

fhr wrote:

I don't get this. gmsh points to libblas, which is a link to whatever implementation of BLAS (openblas, mkl, etc.) is installed, isn't it?

When you link an executable, the linker takes the name of the library you're linking against (e.g. -lblas), searches for the corresponding file (libblas.so) in the standard (and any user-specified) directories, finds /usr/lib/libblas.so, which generally points to the real library (or to a link to the real library), extracts the soname from that dynamic library, and finally links your executable against the soname, not against the name you gave at the start. For openblas the soname is libopenblas.so.0, so this command
$ gfortran -o testlu testlu.o fortimer.o dasum.o -llapack -lblas
creates a testlu executable linked to libopenblas.so.0:
$ ldd testlu
    linux-vdso.so.1 (0x00007fffb4851000)
    liblapack.so => /usr/lib/liblapack.so (0x00007f00c232d000)
    libopenblas.so.0 => /usr/lib/libopenblas.so.0 (0x00007f00c1c4c000)
    libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0x00007f00c1938000)
    libm.so.6 => /usr/lib/libm.so.6 (0x00007f00c1639000)
    libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f00c1424000)
    libquadmath.so.0 => /usr/lib/libquadmath.so.0 (0x00007f00c11ef000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f00c0e42000)
    libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f00c0a25000)
    libgomp.so.1 => /usr/lib/libgomp.so.1 (0x00007f00c0817000)
    /lib/ld-linux-x86-64.so.2 (0x00007f00c2b8f000)

If I install blas, which implies uninstalling openblas, the executable no longer works:
$ ./testlu
./testlu: error while loading shared libraries: libopenblas.so.0: cannot open shared object file: No such file or directory

Offline

#7 2013-01-11 19:25:15

fhr
Member
Registered: 2013-01-08
Posts: 8

Re: Linear algebra

OK, I get it; I didn't know it worked like that. I guess there are good reasons for this behavior...

So, let's say you have the blas from the official repos and install blas-lapack from the AUR (so blas is uninstalled). The BLAS library in blas-lapack is called libblas.so (not libblas-atlas.so or something else), so does that allow your old executables to keep working without being relinked? If yes, can we do the same with openBLAS?

Offline

#8 2013-01-12 13:18:29

gborzi
Member
Registered: 2013-01-05
Posts: 5

Re: Linear algebra

I suppose you mean atlas-lapack, not blas-lapack. If by "old executables" you mean programs linked with the stock blas, e.g. octave from the repo, the answer is yes. They'll keep working with the new libblas, simply because there is a symlink named libblas.so.3 (the soname used when linking) pointing to the actual library. However, if you recompile and relink any such program, it will link against the actual soname, not against libblas.so.3.
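
That compatibility symlink can be sketched in a sandbox (file names invented; the real atlas-lapack layout may differ):

```shell
# Old binaries were linked against soname libblas.so.3; a replacement
# package keeps them resolving by shipping a symlink under that name.
mkdir -p fakeroot/usr/lib
touch fakeroot/usr/lib/libblas-fast.so                 # stands in for the real library
ln -sf libblas-fast.so fakeroot/usr/lib/libblas.so.3   # old soname -> new library
ls -l fakeroot/usr/lib/libblas.so.3
```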

Offline


Powered by FluxBB