MPPNP

General Overview
MPPNP is one of the main codes. While PPN (single zone) does not handle convection, MPPNP (multi-zone) does. MPPNP calculations need far more computing resources than PPN, so running on a cluster is recommended.

See the MPPNP GitHub wiki.

Installation
Basically, follow the ReadMe files in the directories and the MPPNP GitHub wiki page.
 * I tried this installation at the NuGrid meeting in September 2019. In the end it did not work on either my Mac or my Linux machine (Sony): the build (make) succeeded, but running mppnp gave error messages, probably because hdf was not installed properly. I then tried the gca server, and it worked. On 2019/10/18, I installed it and ran it successfully on GC-master. On 2019/10/19, I reinstalled on the Sony Linux machine but still got the same error message, even though hdf5 had been built from source (hdf5-1.10.5) without problems and installed in /usr/local/hdf5. I gave up on that computer because it would be too slow to run mppnp anyway.

'Extremely important!! The items required below (specifically HDF5 and gcc/gfortran) are included in MESASDK, and you can install MPPNP with them (my recommendation), so there is no need to install hdf5 from source. When you install SE this way, you don't need to specify hdf5, because it is already on PATH/LD_LIBRARY_PATH once MESASDK is installed. After installing SE, run "ldd /usr/local/bin/touchse" (or wherever touchse was created) and check that libse.so and libhdf5.so point at the mesasdk directories, which confirms that you are using MESASDK. Also important: the MESASDK August 2018 version seems to work well, because it includes gcc/gfortran 7.2.0 (and creates libhdf5.so.10). MESASDK 2019 with gcc/gfortran 8 or 9 creates libhdf5.so.103, which causes an HDF5 error message when you run MPPNP. To make sure SE works, run "/usr/local/bin/touchse ****h5" (an input file). If everything is fine, nothing happens; if not, you'll see the same HDF5 error message. touchse is an executable that reads and writes hdf5 files, and it is created in the SE directory when you compile SE. Note that SE is written in C (probably because HDF5 is in C), while the NuGrid codes are in Fortran. SE functions are called from NuGrid via FSE.h (#define function F77_FUNC(lowercase, capital case), which is a standard way to interface C and Fortran).'
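
The ldd check described above can be automated. The following is only a minimal sketch and not part of SE or MPPNP: the function names, the sample ldd text, and the /opt/mesasdk prefix are hypothetical and chosen for illustration. It parses ldd output and reports any libse or libhdf5 entry that does not resolve under the given MESASDK directory.

```python
import re
import subprocess

# Matches lines such as "libhdf5.so.10 => /some/dir/libhdf5.so.10 (0x00007f...)"
# and also "libse.so.0 => not found".
LDD_LINE = re.compile(r"\s*(\S+)\s+=>\s+(.*?)\s*(?:\(0x[0-9a-fA-F]+\))?\s*$")


def libs_outside_prefix(ldd_text, prefix, names=("libse", "libhdf5")):
    """Return {library: resolved path} for watched libraries not under prefix."""
    suspicious = {}
    for line in ldd_text.splitlines():
        match = LDD_LINE.match(line)
        if not match:
            continue  # e.g. "linux-vdso.so.1 (0x...)" has no "=>"
        library, resolved = match.groups()
        stem = library.split(".so")[0]  # "libhdf5.so.103" -> "libhdf5"
        if stem in names and not resolved.startswith(prefix):
            suspicious[library] = resolved
    return suspicious


def run_ldd(binary):
    """Run ldd on a binary and return its text output (needs a Linux system)."""
    return subprocess.run(["ldd", binary], capture_output=True, text=True).stdout


# Hypothetical ldd output, for illustration only.
SAMPLE = """\
    linux-vdso.so.1 (0x00007ffd1c5f2000)
    libse.so.0 => /opt/mesasdk/lib/libse.so.0 (0x00007f3a1c000000)
    libhdf5.so.103 => /home/user/anaconda3/lib/libhdf5.so.103 (0x00007f3a1b000000)
    libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f3a1a000000)
"""

print(libs_outside_prefix(SAMPLE, "/opt/mesasdk"))
```

With the sample text, only the libhdf5.so.103 entry is reported, because it resolves under the anaconda3 directory instead of the MESASDK prefix. On a real system you would pass run_ldd("/usr/local/bin/touchse") (or wherever touchse was created) and your own MESASDK path.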

'Also important!! Using the method above, I couldn't compile mppnp on the cycdhcp39 computer. When I switched gfortran from version 7 to version 9 only for compiling mppnp, it worked, and I can run mppnp now. This is odd, because I use gfortran 7 from mesasdk to compile SE and gfortran 9 to compile mppnp (and mppnp uses SE). But anyhow, it worked.'

Installation requirements: '''# COMBINED PROBLEM (SE and HDF5). You need to use the same hdf5 directory for SE and for the hdf5 entry in LD_LIBRARY_PATH. For example, on GC-MASTER I ended up using /gc-master-data/gacgroup/shuyaota/mesadk, which contains the "include" and "lib" directories for hdf5 (hdf5.h and libhdf5.so.103). Note that mesadk does not have libhdf5.so.103 (it has libhdf5.so.10), so I just copied it from anaconda3/lib. Note that when I used anaconda3/lib or anaconda3/pkgs/hdf5-****/lib directly instead, I got an error message when running mppnp like "HDF5-DIAG: Error detected in HDF5 (1.10.3) thread 13********: #000: H5G.c line 299 in H5***: not a location".'''
 * 1) SE library (developed by the NuGrid collaboration). Download NuSE from the NuGrid GitHub and compile it (./configure --> make --> make install). Don't forget to set the path in the Make.local file. If you don't have sudo/su access, SE's "make install" may give a permission error. In that case: 1. make a new directory somewhere, like "/home/shuyaota/bin"; 2. go to the SE directory and run "./configure --prefix=/home/shuyaota/bin"; 3. run "make" and "make install"; 4. set the path in Make.local (SE="/home/shuyaota/bin" in this case). Sometimes ./configure cannot find hdf5.h, in which case you can use something like "./configure --prefix=/home/gacgroup/shuyaota/bin --with-hdf5=/gc-master-data/shuyaota/mesadk" (I used hdf5 1.10.5 from mesa). Because anaconda3 has hdf5 1.10.4 in its lib and include directories (if you installed it by "conda install -c anaconda hdf5"), you may want to use anaconda instead of mesa, but for reasons I don't understand it just gave an error message when running mppnp.
 * 2) OpenMPI. It is easy to install on Linux (Ubuntu) (as far as I remember, no big trouble), but Mac might have some problems. Anyway, Mac is not ideal for running MPPNP. On 2019/9/18, I installed OpenMPI on the GCA server with "sudo apt-get install openmpi-bin openmpi-common openssh-client openssh-server libopenmpi1.10 libopenmpi-dbg libopenmpi-dev". This PDF is useful: OpenMPI. I set the path in Make.local to /usr/bin because that is where openmpi is installed. Also note that libopenmpi-dbg was not installed because it was not found in the apt repository, but that is fine and mppnp works.
 * 3) HDF5. Download hdf5 from the official website and install it from source (ver 1-10-5). I don't know whether it can be installed by apt-get. On GCA it was already installed, so I didn't do anything. My personal Sony Linux machine gives an error message in mppnp, probably because the hdf5 installation is not working properly. On GC-Master, I installed it via anaconda3 (see details in that part of the description). But as of 2019/10/18, GC-Master is not working for mppnp (same error as on Sony Linux: installation worked, but hdf error when running mppnp). NOTE (191106): I installed this on my bs-41 desktop (which is quite clean because I had installed almost nothing there). I installed it in /home/shuyaota/hdf5, and installing SE with that path worked. Note that on the official hdf5 website, the .tar source file is for Unix and the .zip is for Windows (see the Readme file).
 * 4) hdf5-viewer (hdfview). This is not required at all, but it is convenient for checking that a .h5 file has no problem. When I installed it by apt-get, the viewer couldn't open the .h5 files correctly (nothing was shown). So I installed it from source (HDFView-3.1 Linux), downloaded from the official HDF site (run hdfview.sh), and then I could see the .h5 files properly. On GCS, just "apt-get install hdfviewer" was enough to look into h5 files. Important!! hdfview can also be installed with "conda install -c eumetsat hdfview".
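
Much of the trouble above comes down to which libhdf5 (for example .so.10 versus .so.103) sits in which directory. As a rough aid, the sketch below lists the libhdf5 shared-library files found in each directory of an LD_LIBRARY_PATH-style string. The function name find_shared_libs is hypothetical, and the dynamic loader also consults other locations (such as the ldconfig cache), so this gives only a partial view.

```python
import glob
import os


def find_shared_libs(search_path, stem="libhdf5"):
    """List files named <stem>.so* in each directory of a colon-separated path."""
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ":"
        pattern = os.path.join(directory, stem + ".so*")
        hits.extend(sorted(glob.glob(pattern)))
    return hits


if __name__ == "__main__":
    # Report every libhdf5 candidate visible through LD_LIBRARY_PATH, in search order.
    for path in find_shared_libs(os.environ.get("LD_LIBRARY_PATH", "")):
        print(path)
```

Directories are reported in the order they appear in the path, so the first hit is the one most likely to be picked up through LD_LIBRARY_PATH.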

Now test with the Run_Template directory.
 * 1) cd run_template.
 * 2) Set the PCD path to the "source" directory in the Makefile of that directory, like "PCD=../source".
 * 3) Follow the ReadMe file, i.e., run "make" (after "make distclean" if you wish). This will create mppnp.exe. If you get an error like "undefined reference", check that LD_LIBRARY_PATH is set correctly. "export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/gacgroup/shuyaota/anaconda3/lib" is an example from GC-Master (after setting this, the installation worked).
 * 4) Run with "mpirun -np 3 mppnp.exe" (replace 3 with the number of CPUs you have). You will see output files. You need to set the data directory (datadir) and prefix appropriately in ppn_frame.input. Most likely you need to download these data files from CANFAR. I ran the M25_z0.02 case on GCA and it worked well.
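
For step 4, the process count can be taken from the machine instead of typed by hand. This is a small sketch under assumptions: mpirun_command is a hypothetical helper, and os.cpu_count() reports logical CPUs, which may be more than the number of physical cores.

```python
import os
import shlex


def mpirun_command(executable="./mppnp.exe", nproc=None):
    """Build the mpirun argument list; default to the local logical CPU count."""
    if nproc is None:
        nproc = os.cpu_count() or 1  # cpu_count() may return None
    return ["mpirun", "-np", str(nproc), executable]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(mpirun_command()))
```

The list form can be passed directly to subprocess.run once the printed command looks right.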

On 2020/7/4, I installed mppnp on the GAC-ACQDATA server (new cluster). It succeeded in the end, but I had to take a different path from the cycdhcp case (which runs the same CentOS). The primary differences are:


 * 1) I had to use the same mesasdk versions as on cychdcp. The latest one was r12778, but I used the same one as on cychdcp (r12115), together with the older one (r-10398, for gfortran ver 7.2). That is, I compiled NuSE with r-10398, and then switched to r12115 to compile mppnp with gfortran ver 9.2.0.
 * 2) You probably don't need to set the hdf5 path in ./configure for NuSE. When I set it, mppnp kept showing errors, but when I removed it, it worked (you still have to keep the path to SE/bin).
 * 3) You probably need to set LD_LIBRARY_PATH to the r-10398 mesasdk. Set it in .bashrc.
 * 4) When I used r12778 (for both compiling and LD_LIBRARY_PATH), I got the usual error messages (hdf 1-5-0...etc deactivate etc).
 * 5) I have a hard time understanding the role of touchse. I thought that if "touchse ***.hdf5" ran without problems, mppnp would be fine as well, but that was not the case this time: with the successful setup above, I still see the error from touchse, but not from mppnp. I need to figure this out, but since it is currently working, I will probably leave it for a while.

Examples
- I downloaded the M25 Z0.02 h5 files from CANFAR (using wget with the URL) and set their paths in ppn_frame.input.

- "make"

- "mpirun -np 8 mppnp.exe" <-- because the gca server has 8 cores

- "python abu_evolution.py" <-- I modified "abu_evolution.py" from ppn. Basically, follow what nugridse.py says at the beginning of the code. Note you need to change to "c = nugridse.se" around line 120 of data_plot.py. This specifies that we're using MPPNP, not PPN.

- With the above script, you can draw a mass (radius) vs rho curve for a specific cycle, and so on. (Note that at this point you just plot mass vs rho in one cycle; radius and rho are functions of mass within the cycle.)
 * 1) run_test (or run_template) (now working as of 2019/9/21)

- Downloaded the h5 files from CANFAR as indicated by the ppn_frame.input in each directory (M6, M3, M5).

- "make"

- Before running "mpirun -np 8 mppnp.exe", note that in ppn_frame.input you need iniabu = 11 if you want to set an abundance file (the MPPNP wiki says 10, but that is wrong).

- If you don't set the initial abundance correctly, you won't see any output in iso_mass_f, etc. in H5_out or H5_surf (only surface area). So you need to be extra careful.

- Run mppnp (see above).

- Now use "analysis.py", which is based on the "abu_evolution.py" I described above. Make sure the output file directory path is set correctly (H5_out; H5_surf may be ok?). Also note you can limit the radius range over which to see the abundance with mass_range=[]. amass_range is for atomic mass, not stellar mass (radius).
 * 2) example codes: mppnp_HBB, mppnp_Hcore_burning, mppnp_Hecore_burning (working in Sep. 2019, with improved python visualization, etc. All worked well!).
 * 3) One basic mistake is forgetting to check the 'last_restart' file. If you forget to reset it to 0 1, you get an error message when you just want to start the run from scratch.
 * 4) Sometimes mpirun shows an error like ".... READ -1, errnum=1". You can fix this by setting "export OMPI_MCA_btl_vader_single_copy_mechanism=none" in ~/.bashrc.
 * 5) Sometimes the hdf5 output file is broken (not all cycles are included, or nothing is inside, etc.). This can be confirmed with the "h5ls" command or the "hdfview" command. You can fix it with the last_restart method.
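
As a first quick filter before opening files with h5ls or hdfview, one can check whether each output file at least starts with the 8-byte HDF5 format signature. This is only a sketch: the function names are hypothetical, it assumes the files have no user block (so the signature sits at offset 0), and passing this check does not show that all cycles are present. It catches empty or badly truncated files only.

```python
import os

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # first 8 bytes of an HDF5 superblock


def has_hdf5_signature(path):
    """Return True if the file starts with the HDF5 format signature."""
    with open(path, "rb") as handle:
        return handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


def suspicious_h5_files(directory):
    """Return the names of .h5 files in directory that lack the signature."""
    bad = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".h5") and not has_hdf5_signature(os.path.join(directory, name)):
            bad.append(name)
    return bad
```

For example, suspicious_h5_files("H5_out") would list files worth inspecting more closely with h5ls.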

How to run on several computers on the same LAN in parallel
- This website is pretty helpful. https://mpitutorial.com/tutorials/running-an-mpi-cluster-within-a-lan/

- It is very strongly recommended to use computers with the same setup (OS, mpirun version, gcc/gfortran versions, etc.). I tried to run Ubuntu 18 (mpirun 2.1) and Ubuntu 16 (mpirun 1.10) together, but it failed. I upgraded the latter's mpirun to 2.1, but it still didn't work.

- Most importantly, this is the command: "mpirun -np 12 -oversubscribe --mca btl_tcp_if_include 165.91.229.182/23,165.91.229.190/23 -hostfile ~/hostfile ./mppnp.exe". Here, the hostfile contains two lines: "master slots=4" and "cychdcp190 slots=8". The -oversubscribe option is needed when you see an error without it. In principle you should be able to use all processors in a computer, but sometimes mpirun complains without this option. In 165.91.229.182/23, the /23 means that the first 23 bits (8+8+7) are the network address (probably belonging to TAMU) and the remaining 9 bits are the host address. 165.91.229.182 and 165.91.229.190 are the IP addresses of the two computers (check with "ip a").
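
The /23 notation and the hostfile slot count can be checked with a few lines of Python. This is an illustrative sketch using the standard ipaddress module; the helper names networks_of and total_slots are hypothetical, and total_slots counts only explicit "slots=N" entries.

```python
import ipaddress


def networks_of(cidr_list):
    """Map each 'address/prefix' string to the network it belongs to."""
    return {cidr: ipaddress.ip_interface(cidr).network for cidr in cidr_list}


def total_slots(hostfile_text):
    """Sum the explicit 'slots=N' entries in hostfile text."""
    total = 0
    for line in hostfile_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on spaces
        for field in fields[1:]:  # fields[0] is the host name
            if field.startswith("slots="):
                total += int(field[len("slots="):])
    return total


nets = networks_of(["165.91.229.182/23", "165.91.229.190/23"])
for cidr, net in nets.items():
    print(cidr, "->", net, "with", net.num_addresses, "addresses")

print(total_slots("master slots=4\ncychdcp190 slots=8"))
```

Both addresses fall in the same network, 165.91.228.0/23, which holds 512 addresses (9 host bits), and the two hostfile lines add up to 12 slots, matching "-np 12" in the command.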

- You need "sudo" permission many times: for /etc/hosts, /etc/exports, exportfs, the nfs installation, the passwordless ssh setup, etc.

- Also note that the firewall must be disabled. When I parallelized cyc and SonyLinux, the SonyLinux firewall was active (so I disabled it, and then it worked). Commands to check the firewall on CentOS are "systemctl status firewalld.service" and "systemctl disable firewalld.service" (the service is named firewalld on CentOS 7 and later; check on your system).

- Sometimes you need to start or restart the nfs server with "systemctl start nfs-server.service" ("systemctl enable nfs-server.service" keeps nfs enabled even after you reboot the computer).

- Note that the data directory on nfs and the directory where the remote computer mounts it must have the same absolute path. Otherwise you need to set a symbolic link.

NOW (2021/1/22) the combined GAC-ACQDATA + GC-MASTER server is ready
- Run like this in the directory where you want to run mppnp: mpirun -np 122 -oversubscribe --mca btl_tcp_if_include 165.91.228.3/23,165.91.228.164/23 -hostfile ~/hostfile ./mppnp.exe

- Currently, Gac-AcqData is the main server and GC-master is the sub (remote) one, so the host file is in the home directory of GC-master (although the home directory is now shared between the two servers).

- This (GAC-ACQDATA + GC-Master) combination is easier to set up than usual, because the two servers already share a lot.

- Read the website above for setup; we were mostly helped by Kris Hagel. The main things we needed to do were to set up password-less ssh and turn off the firewalls (on both computers). That's it.

- Computation is very fast. A 25 Msun, Z=0.02 MPPNP calculation took only 7.5 hrs (12 hrs when using only one server).