Testing datatype size in Fortran

by Tobias Hertkorn on February 2nd, 2007


#define writeTopic(x) write(0,*) ""; write(0,*) "TESTING SIZE OF ",x
#define getSizeOf(x) write(0,'(A30,A,I4)') #x,":", SIZEOF(x)
#define getSizeOfDesc(d,x) write(0,'(A30,A,I4)') d,":", SIZEOF(x);write(0,*) " Arg:",x

Program testSizeof
Integer testIntegerDefaultKind
Integer(KIND=1) testIntegerKind1
Integer(KIND=2) testIntegerKind2
Integer(KIND=4) testIntegerKind4
Integer(KIND=8) testIntegerKind8

Real testRealDefaultKind
Real(KIND=4) testRealKind4
Real(KIND=8) testRealKind8
Real(KIND=16) testRealKind16

Complex testComplexDefaultKind
Complex(KIND=4) testComplexKind4
Complex(KIND=8) testComplexKind8
Complex(KIND=16) testComplexKind16

writeTopic("Constants-Integer")
getSizeOf(0)
getSizeOf(0_1)
getSizeOf(0_2)
getSizeOf(0_4)
getSizeOf(0_8)
getSizeOf(1/2)
getSizeOf(1/2_8)
getSizeOf(1/Z'7FFFFFFFFFFFFFFF') ! compiles on ifort only if -i8 is specified
getSizeOf(1_8/2)
getSizeOf(1_8/Z'7FFFFFFFFFFFFFFF')

writeTopic("Constants-Real")
getSizeOf(0.)
getSizeOf(0._4)
getSizeOf(0._8)
getSizeOf(0._16)
getSizeOf(1./2)
getSizeOf(1./2_8)
getSizeOf(1./Z'7FFFFFFFFFFFFFFF') ! compiles on ifort only if -r8 is specified
getSizeOf(1._16/Z'7FFFFFFFFFFFFFFF')
getSizeOf(1/2.)
getSizeOf(1/2._16)
getSizeOf(1._16/2)

writeTopic("Constants-Complex")
getSizeOf((0.,0.))
getSizeOf((0._4,0._4))
getSizeOf((0._8,0._8))
getSizeOf((0._16,0._16))
getSizeOf((1.,1.)/2)
getSizeOf((1.,1.)/2_8)
getSizeOf((1.,1.)/Z'7FFFFFFFFFFFFFFF') ! interestingly this works without -r8 on ifort
getSizeOf((1.,1.)/2.)
getSizeOf((1.,1.)/2._8)
getSizeOf((1.,1.)/2._16)
getSizeOf(1/(1.,1.))
getSizeOf(1/(1._16,1._16))
getSizeOf((1._16,1._16)/2)

writeTopic("Integer")
testIntegerDefaultKind=0
testIntegerKind1=0_1
testIntegerKind2=0_2
testIntegerKind4=0_4
testIntegerKind8=0_8
getSizeOf(testIntegerDefaultKind)
getSizeOf(testIntegerKind1)
getSizeOf(testIntegerKind2)
getSizeOf(testIntegerKind4)
getSizeOf(testIntegerKind8)

writeTopic("Real")
testRealDefaultKind=0.
testRealKind4=0._4
testRealKind4=0.E0_4 ! 0.E0 is strictly speaking not enough since -r8 would change the behaviour
testRealKind8=0._8
testRealKind8=0.D0 !0.E0_8 would also be valid
testRealKind16=0._16
testRealKind16=0.Q0
getSizeOf(testRealDefaultKind)
getSizeOf(testRealKind4)
getSizeOf(testRealKind8)
getSizeOf(testRealKind16)

writeTopic("Complex")
testComplexDefaultKind=(0.,0.)
testComplexKind4=(0.E0_4,0.E0_4) ! See Real for additional notations
testComplexKind8=(0.E0_8,0.E0_8)
testComplexKind16=(0.E0_16,0.E0_16)
getSizeOf(testComplexDefaultKind)
getSizeOf(Real(testComplexDefaultKind))
getSizeOf(Imag(testComplexDefaultKind))
getSizeOf(testComplexKind4)
getSizeOf(Real(testComplexKind4))
getSizeOf(Imag(testComplexKind4))
getSizeOf(testComplexKind8)
getSizeOf(Real(testComplexKind8))
getSizeOf(Imag(testComplexKind8))
getSizeOf(testComplexKind16)
getSizeOf(Real(testComplexKind16))
getSizeOf(Imag(testComplexKind16))

writeTopic("Wrong assignments - Integer")
testIntegerKind8=Z'7FFFFFFFFFFFFFFF'
testIntegerDefaultKind=testIntegerKind8
getSizeOfDesc("Kind8 in default",testIntegerDefaultKind)
testIntegerKind1=testIntegerKind8
getSizeOfDesc("Kind8 in Kind1",testIntegerKind1)
testIntegerKind2=testIntegerKind8
getSizeOfDesc("Kind8 in Kind2",testIntegerKind2)
testIntegerKind4=testIntegerKind8
getSizeOfDesc("Kind8 in Kind4",testIntegerKind4)
testIntegerKind8=testIntegerKind8
getSizeOfDesc("Kind8 in Kind8",testIntegerKind8)
testIntegerDefaultKind=0_1
getSizeOfDesc("0_1 in default",testIntegerDefaultKind)
testIntegerDefaultKind=0_2
getSizeOfDesc("0_2 in default",testIntegerDefaultKind)
testIntegerDefaultKind=0_4
getSizeOfDesc("0_4 in default",testIntegerDefaultKind)
testIntegerDefaultKind=0_8
getSizeOfDesc("0_8 in default",testIntegerDefaultKind)
testIntegerDefaultKind=Z'7FFFFFFFFFFFFFFF' ! compiles on ifort only if -i8 is specified
getSizeOfDesc("8byte Hex in default",testIntegerDefaultKind)
testIntegerKind8=0_1
getSizeOfDesc("0_1 in Kind8",testIntegerKind8)
testIntegerKind8=0_2
getSizeOfDesc("0_2 in Kind8",testIntegerKind8)
testIntegerKind8=0_4
getSizeOfDesc("0_4 in Kind8",testIntegerKind8)
testIntegerKind8=0_8
getSizeOfDesc("0_8 in Kind8",testIntegerKind8)
end Program

This small code snippet prints the sizes of the various data types. This is especially interesting when specifying -rN or -iN with ifort. Also pay special attention to how your compiler handles lines 30, 41 and 125 of the listing (counting from the first #define). These are the integer and real divisions by Z'7FFFFFFFFFFFFFFF' and the assignment of that hex constant to the default-kind integer, each marked with a comment about ifort. For example, IBM’s f90 compiler just gives the warning “Source is longer than target. Truncation will occur on the left.” I wrote this small program to verify the behaviour of the various compilers I use.
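SIZEOF, as used in the listing above, is a compiler extension rather than standard Fortran. For compilers that support Fortran 2008, a minimal sketch of the same idea with the standard storage_size intrinsic might look like the following. The program name, the report helper and the variable names are made up for illustration, and the kind constants from iso_fortran_env are assumed to be supported by the compiler at hand (an unsupported kind shows up as a compile error).

```fortran
program portable_sizes
  ! Kind constants from the Fortran 2008 intrinsic module.
  use, intrinsic :: iso_fortran_env, only: int8, int16, int32, int64, &
                                           real32, real64
  implicit none

  integer          :: iDefault
  integer(int8)    :: i8
  integer(int16)   :: i16
  integer(int32)   :: i32
  integer(int64)   :: i64
  real             :: rDefault
  real(real32)     :: r32
  real(real64)     :: r64
  complex          :: cDefault
  complex(real64)  :: c64

  ! storage_size is an inquiry function, so the variables need not be defined.
  call report('default integer', storage_size(iDefault))
  call report('integer(int8)',   storage_size(i8))
  call report('integer(int16)',  storage_size(i16))
  call report('integer(int32)',  storage_size(i32))
  call report('integer(int64)',  storage_size(i64))
  call report('default real',    storage_size(rDefault))
  call report('real(real32)',    storage_size(r32))
  call report('real(real64)',    storage_size(r64))
  call report('default complex', storage_size(cDefault))
  call report('complex(real64)', storage_size(c64))

  ! Expressions work too: the result kind follows the usual promotion rules.
  call report('1/2_int64',       storage_size(1 / 2_int64))
  call report('1./2',            storage_size(1. / 2))

contains

  subroutine report(label, bits)
    character(len=*), intent(in) :: label
    integer, intent(in)          :: bits
    ! storage_size returns bits; dividing by 8 assumes 8-bit bytes.
    write(*,'(A30,A,I4,A)') label, ':', bits / 8, ' bytes'
  end subroutine report

end program portable_sizes
```

The default-kind lines are the ones to watch when switching on options such as -i8 or -r8, since those options change what "default" means.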

February 2nd, 2007 5:08 pm | Comments (0)

Getting started using Microsoft Compute Cluster

by Tobias Hertkorn on January 5th, 2007

Well, strictly for academic reasons I wanted to find out how fast the new Compute Cluster Server (CCS) by Microsoft really is, so I took the chance and installed it on my brand-new Fujitsu-Siemens Celsius M. Yes, I got it today from my vendor and thought: before installing Linux and WinXP on it, why not test the code on it that I am writing as part of my diploma thesis? Granted, this is not a real test of the speed of the CCS, since there will be only one node and the code will run on the head node. But it will for sure be a good test of whether the Fortran 90 code compiles and runs. Hell, it will even give a good estimate of how many CPU cycles are used by the underlying 2003 Server and the MPICH framework, because I will provide some data on how fast the same code performs on the same machine using Linux (Debian’s AMD64 port) and MPICH.

Some more specs about the setup:

Intel® Core 2 Duo E6400 / 2.13 GHz “Conroe” – L1 Cache 2x 32 KB – L2 Cache 2x 1024 KB
Chipset Intel® 975X Express
2 GB RAM DDR II SDRAM 667 MHz
1 x ST3250820AS – 250 GB – Serial ATA-300 – 7200 rpm
NVIDIA Quadro FX 1500 – 256 MB
Ethernet-Controller Broadcom BCM5751

Here are some test results by the one and only BOINC CPU benchmark (Single threaded):

Core2Duo E6400:
Measured floating point speed: 1990.18 million ops/sec
Measured integer speed: 4160.19 million ops/sec
Pentium 4 Northwood 2.4 GHz:
Measured floating point speed: 964.82 million ops/sec
Measured integer speed: 1139.22 million ops/sec

But first things first – Installing
Well, right now I am forced to install a 2003 Server x64 using CD 1 as provided at http://www.microsoft.com/windowsserver2003/ccs/trial/installinstruct.mspx. Let’s see if all the necessary drivers are available for the x64 environment.

Damn, there is no driver for my Broadcom NetXtreme Gigabit Ethernet. Hmm. Let’s hope the 2003 Server recognizes my USB stick. Phew. It worked, and thank god for my second PC and the excellent online support by Fujitsu-Siemens. But isn’t it annoying that it takes about 5 restarts before an MS installation is complete? Well, the stuff one has to do. ;-) Urgs. There are 200 MB worth of updates available. Good thing this is a very, very fast internet conn… And downloads from the update server run at around 5 MB/s. Well, at least the installation of the patches already shows some of the fastness and furiousness of this piece of hardware. Truly impressive.

Updates are still running. Damn. Well, let’s talk a bit about the code I am about to run. It is a two-dimensional turbulence simulation using a pseudo-spectral approach. That means that some calculations are done in Fourier space and others in real space. Therefore the code depends heavily on FFTs. Time stepping is done using a leapfrog integration.
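For readers who have not met leapfrog time stepping: it advances the solution from the previous and the current time level, u(n+1) = u(n-1) + 2*dt*f(u(n)). The following is only a toy sketch of that scheme on a single oscillating mode, du/dt = i*omega*u. It is not the turbulence code itself, and all names and parameter values (omega, dt, nsteps, rhs) are made up for illustration.

```fortran
program leapfrog_sketch
  implicit none
  integer, parameter  :: dp = kind(1.0d0)
  real(dp), parameter :: omega = 1.0_dp     ! assumed oscillation frequency
  real(dp), parameter :: dt = 0.01_dp       ! assumed time step
  integer, parameter  :: nsteps = 1000      ! assumed number of steps
  complex(dp) :: uOld, uNow, uNew
  integer :: n

  ! Initial condition at step 0.
  uOld = (1.0_dp, 0.0_dp)

  ! Leapfrog needs two time levels, so bootstrap step 1 with forward Euler.
  uNow = uOld + dt * rhs(uOld)

  ! Leapfrog: u(n) = u(n-2) + 2*dt*f(u(n-1)) for steps 2..nsteps.
  do n = 2, nsteps
    uNew = uOld + 2.0_dp * dt * rhs(uNow)
    uOld = uNow
    uNow = uNew
  end do

  write(*,*) 'leapfrog:', uNow
  write(*,*) 'exact:   ', exp(cmplx(0.0_dp, omega * nsteps * dt, kind=dp))

contains

  ! Right-hand side of the toy equation du/dt = i*omega*u.
  pure function rhs(u) result(dudt)
    complex(dp), intent(in) :: u
    complex(dp) :: dudt
    dudt = cmplx(0.0_dp, omega, kind=dp) * u
  end function rhs

end program leapfrog_sketch
```

In a pseudo-spectral code the right-hand side would be where the FFTs come in: the nonlinear term is evaluated in real space and transformed back to Fourier space before each step.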

The whole code uses MPI to distribute memory consumption and computational load. The convolution especially benefits from parallel computing, as do the FFT routines, which use FFTW2 because the code needs distributed-memory FFTs.

The code is usually run on the ALTIX at the LRZ in Munich or the PSI at MPI in Greifswald. For this particular test on CCS I pre-tested it on a Pentium 4 running Debian using MPICH and the Intel Fortran compiler ifort. Obviously the tests were single threaded, since only one CPU was available. I will use a setup (1024×1024 grid points) for this test that needed about 50 minutes to complete on that particular Pentium 4 (Prescott 3.2GHz).

Well, the updates are done. Time to pop in the second CD. Since it complains that there is no Active Directory available, I use the dcpromo.exe wizard to set up a domain and DNS Server on this machine. Which takes a couple of minutes. Aaaand another restart. This is starting to get to me!

Okay, let’s try this again. CD2. And lift-off. Since this is going to be the only computing node, I select “Create a new compute cluster with this server as the head node” and I check “Use this head node as a compute node”. Hmm, it complains that this computer only has one NIC. Well, since I am not going to use the net for MPI, I don’t really care and choose to continue anyway. Uhoh, some more hotfixes to install. Smells like restart spirit… Oops. Something really screwed with my network settings. Ah, faulty DNS settings… Yes, first reboot after ICS Hotfix. Wow, this time it takes really long to boot. Granted, there are some more services to start now.

All updates are installed and it is time to install the Microsoft Compute Cluster Pack. Done. There is a nice to-do list that pops up. Obviously I have to create a network topology. And do some user management. My topology should be “All nodes only on public network”, which is the simplest. After the change of topology is done successfully, the wizard informs me that I really should switch off the internal firewall. Okay. There even is a nice wizard for that. Done. Node Management is not necessary since it already knows that I will use the head node as compute node. Nice. Well, last step: User Management. Weeeell. Done?!? Where to go next? Let’s have a look at the Compute Cluster Administrator. Hmm. The right number of processors is detected and all are flagged as idle. Good. But I really have no idea how to start compiling my code.

Digging around some I came across this link:

http://windowshpc.net/blogs/apps_general/archive/2006/06/04/134.aspx

That would get me started if I were using VS 2005. Which could also be a lot of fun, but maybe later. I really want to get Fortran up and running. While digging around some more I am downloading a trial version of the Intel Fortran Compiler 9.1 for Windows.

Hmm. Damn, it is getting way too late. It’s past pi (3:14 ;-) ). Stop.

January 5th, 2007 4:12 am | Comments (0)

Getting Debian MPICH and icc/ifort ready on Core2Duo/amd64 port

by Tobias Hertkorn on January 3rd, 2007

Sorry, today’s post won’t be pretty and certainly not for people who are not familiar with Debian and installing. It’s basically just a short summary for me. But feel free to email me or post a comment if you need assistance.

Get the Intel Fortran and C/C++ compilers. They are free to download if you use them for non-commercial purposes.

http://registrationcenter.cps.intel.com/irc_nas/537/l_fc_c_9.1.036.tar.gz

http://registrationcenter.cps.intel.com/irc_nas/534/l_cc_c_9.1.042.tar.gz

Installing the stuff:
Make sure the ia32-libs package is installed when running an amd64 port:
apt-get install ia32-libs
This is not the default on Debian!

Pointers on how to install the intel compilers:

http://ubuntuforums.org/showthread.php?t=149579

http://lunatics.at/index.php?op=Articles;article=2

Get MPICH2, unpack and configure it:
CC=icc CXX=icpc F77=ifort F90=ifort FFLAGS='-fomit-frame-pointer -xT -O3 -ip' CFLAGS='-fomit-frame-pointer -xT -O3 -ip' ./configure --prefix=/opt/mpich2 --with-device=ch3:shm --enable-fast --enable-f77

Make sure to include /opt/mpich2/bin in your path.
Get FFTW2, unpack and configure it:
CC=icc CXX=icpc F77=ifort F90=ifort FFLAGS='-fomit-frame-pointer -xT -O3 -ip' CFLAGS='-fomit-frame-pointer -xT -O3 -ip' ./configure --prefix=/opt/fftw-2.1.5 --enable-mpi --enable-sse2 --enable-i386-hacks

January 3rd, 2007 1:13 pm | Comments (0)