Code Coffee is a bring-your-own-coffee discussion group on computing techniques and tools at Steward Observatory at the University of Arizona.
Announcements are distributed via the astro-code-coffee mailing list.
Next meeting
Topic: TBD
Speaker: TBD
Time: TBD
Location: TBD
-
Introductory C programming for astrophysicists
by Professor Dimitrios Psaltis • Apr 16, 2019
Dr. Psaltis gave a one-hour crash course on C programming based on material from his class.
Slides (PDF)
Code listings
ex1.c
Compile with `gcc ./ex1.c -o ex1`, run with `./ex1`.

```c
#include <stdio.h>

int main(void)
{
    float x;

    for (x = 0; x <= 1; x += 0.1) {
        printf("x = %10.8f f(x) %10.8f\n", x, x * x);
    }
    return 0;
}
```
ex2.c
Compile with `gcc ./ex2.c -o ex2`, run with `./ex2`. (Note: on Linux you need to compile with `gcc ./ex2.c -o ex2 -lm` to link the math library. See this StackOverflow answer for historical context.)

```c
#include <stdio.h>
#include <math.h>
#include <time.h>

#define Nrep 1000000

int main(void)
{
    double x = 1.3, a;
    double time, Mflops;
    int i;
    clock_t ticks1, ticks2;

    ticks1 = clock();
    for (i = 1; i <= Nrep; i++) {
        a = x + x;
    }
    ticks2 = clock();

    time = (1.0 * (ticks2 - ticks1)) / CLOCKS_PER_SEC / Nrep;
    Mflops = 1.e-6 / time;
    printf("it took %e seconds\n", time);
    printf("this corresponds to %f MFLOPS\n", Mflops);
    return 0;
}
```
Substitute the loop body (`a = x + x`) with lines from the following block, and see how the number of MFLOPS changes.

```c
a = i * x;
a = i / x;
a = i / x / x;
a = i / (x * x);
a = sin(x) * sin(x) + 2. * cos(x) * cos(x);
a = 1. + cos(x) * cos(x);
a = log(x);
a = pow(x, 5.);
a = x * x * x * x * x;
a = i / sqrt(pow(sin(x), 2.000001) + 2. * pow(cos(x), 2.000001));
a = i * pow(1. + cos(x) * cos(x), -0.5);
```
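A similar quick-and-dirty timing comparison can be sketched in Python with the standard library's `timeit` module (this stdlib sketch is an illustration added here, not part of the talk):

```python
import math
import timeit

# Time a few expressions inspired by the list above and report
# an approximate per-call cost in seconds.
x = 1.3
expressions = {
    "x * x": lambda: x * x,
    "math.log(x)": lambda: math.log(x),
    "math.pow(x, 5.)": lambda: math.pow(x, 5.),
    "sin(x)**2 + 2*cos(x)**2": lambda: math.sin(x) ** 2 + 2.0 * math.cos(x) ** 2,
}

n_rep = 100_000
for label, fn in expressions.items():
    seconds_per_call = timeit.timeit(fn, number=n_rep) / n_rep
    print(f"{label:30s} {seconds_per_call:.3e} s/call")
```

Note that in Python the interpreter overhead dominates such micro-benchmarks, so the relative differences between expressions are much smaller than in the compiled C version.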
-
Structure and Development of Computer Programs
by Gabriele Bozzola • Mar 19, 2019
Presentation materials: https://github.com/Sbozzolo/structure_development_tucson
Book recommendations:
- The Pragmatic Programmer - Andrew Hunt and David Thomas
- Clean Code - Robert C. Martin
-
Interactive data visualization with Bokeh
by Peter Senchyna • Feb 19, 2019
-
Using containers to move computations to HPC and the Cloud
by Joseph Long • Jan 22, 2019
Containers allow you to bundle up a program or script and all the packages and libraries it uses in a single archive that runs on any computer with a container runtime like Docker or Singularity (including the University HPC systems).
To follow the slides, you will need to create a DockerHub account and install Docker Desktop for your platform:
- Download for macOS
- Download for Windows
- Linux: consult the Manual for your distribution (e.g. instructions for Ubuntu)
The full text of the example Dockerfile we built is below:
```dockerfile
FROM centos:6.10
RUN yum install epel-release -y
RUN yum update -y
RUN rpm --import http://li.nux.ro/download/nux/RPM-GPG-KEY-nux.ro
RUN rpm -Uvh http://li.nux.ro/download/nux/dextop/el6/x86_64/nux-dextop-release-0-2.el6.nux.noarch.rpm
RUN yum install ffmpeg ffmpeg-devel -y
```
-
git and GitHub for Research
by Joseph Long • Nov 13, 2018
Additional resources
- Cheat sheet by GitLab
- Hosting services
- Free services for students
- Software Carpentry lessons
-
How to accelerate your code in under 10 lines
by Rachel Smullen • Oct 23, 2018
Rachel Smullen gave a talk on using OpenACC, a system of hints for your C, C++, and Fortran code that enables a compiler to move computations to the GPU or other accelerators. She attended the International High Performance Computing Summer School last summer, which covered OpenACC, and was kind enough to share what she learned with all of us.
Example code to start from (exercise and solution in the slides)
There are also some slides from IHPCSS by John Urbanic that cover OpenACC in more detail:
-
Intro to CyVerse
by Eckhart Spalding • Feb 14, 2018
What is CyVerse?
CyVerse started out in 2008 as the NSF-funded ‘iPlant Collaborative’ project to serve the changing needs of the life sciences, which have increasingly heavy computational demands (think protein folding, genetic phenotyping, etc.) but have historically not had high-performance computing facilities. The project was rechristened ‘CyVerse’ in 2015.
Why should an astronomer care about it?
CyVerse cyberinfrastructure has the serendipitous effect of providing computational resources to basically anyone here at UA and at more than 8,000 participating institutions.
Possible reasons to care:
- If you have computational needs that are greater than what your office machines can provide, but less than what would usually pass for HPC jobs
- If you want to calculate something now, and don’t want to put your job in a queue
- Very flexible for access remotely or from low-performance machines (basically all you need to do is ‘ssh’)
- This infrastructure serves as a lead-in for addressing needs that span the community, like interoperability and reproducibility.
This talk was presented in a Jupyter Notebook which can be viewed on GitHub or downloaded directly.
-
How-to Docker (with astroML)
by CK Chan • Feb 14, 2018
This is a minimal Docker tutorial using astroML as an example. Docker is needed in order to try it out. The Docker Community Edition (CE) is free and is available from the Docker website. It supports all the major platforms, from Linux (e.g., Ubuntu) to Mac OS X and even Windows.
The supporting files for this tutorial are available in the repository for this site under downloads/2017-18/chan-howto-docker-astroML.
`Dockerfile` is the main part of this tutorial. It tells Docker how to create a Docker image, which you can instantiate as Docker containers on any machine with Docker: from your laptop, to a powerful many-core virtual machine (VM) on CyVerse, to thousands of VMs that you launch with container orchestration platforms on Google Cloud Platform.

`Makefile` contains the commands that we will run during the tutorial.

`plot_spectrum_sum_of_norms.py` is an example from astroML, modified to run better in a container environment.
-
Real Programmers Debug with Fire Extinguishers
by Craig Kulesa • Nov 15, 2017
Talk slides are available here. A hands-on session with real hardware is offered during the normal Code Coffee time slot (maybe 11/29 or 12/6); email Craig (ckulesa@email…) if you’re interested.
Resources
The Antarctic observatory used as an example in the talk is the High Elevation Antarctic Terahertz (HEAT) telescope at Ridge A.
Go build something and control it with software. Sparkfun and Adafruit have a lot of good resources and some nifty development boards to get you started.
Talking to hardware: SPI and I2C
Wikipedia links for I2C and SPI.
A very low-level example of performing SPI communication via a microcontroller in C is here.
For the python-centric, look at spidev and I2C, also this.
Talking to hardware: Serial (RS232, RS485, RS422)
If you get a USB-to-serial converter, ones with a Prolific PL2303 chipset will work generically under just about any operating system without fuss. They can be purchased with flying leads for about $5 and with a DB9 connector for $10-20.
The bible: Serial Programming Guide for POSIX Operating Systems.
The pySerial API is excellent and will get you to a working device quickly.
Talking to hardware: CAN bus
The SocketCAN or can-utils package will let you display, record, generate, or replay CAN traffic.
The python-can library is excellent.
Wrappers for astronomical hardware
Abstract your system onto the network using network sockets. Examples of basic client and server operations can be found for most languages, like C and Python. This lets you control your instrument using scripts or a GUI without impacting the hardware or the low-level control software itself.
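A minimal sketch of that client/server pattern in Python's standard library (the `STATUS` command and the server behavior are invented for this example; a real instrument server would dispatch many commands):

```python
import socket
import threading

HOST = "127.0.0.1"
ready = threading.Event()
chosen = {}

def serve_once():
    """Toy 'instrument server': accept one connection, answer one command."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, 0))  # port 0: let the OS pick a free port
        chosen["port"] = srv.getsockname()[1]
        srv.listen(1)
        ready.set()  # tell the client we are accepting connections
        conn, _ = srv.accept()
        with conn:
            command = conn.recv(1024).decode().strip()
            reply = b"OK: idle" if command == "STATUS" else b"ERR: unknown command"
            conn.sendall(reply)

server = threading.Thread(target=serve_once)
server.start()
ready.wait()

# The "script or GUI" side: connect, send the made-up STATUS command, read the reply.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
    client.connect((HOST, chosen["port"]))
    client.sendall(b"STATUS")
    answer = client.recv(1024).decode()
server.join()
print(answer)  # → OK: idle
```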
INDI, the Instrument Neutral Distributed Interface, is a nifty way to wrap up multiple elements of an astronomical system into a clean, self-describing whole.
-
Python + joblib: Make your computer work harder, and save yourself time
by Rachel Smullen and Joseph Long • Nov 8, 2017
The `joblib` package (`pip install joblib`) provides helpers for easy parallelization and caching (memoization) of function outputs.

Rachel presented examples of `joblib`’s `Parallel` helper. Her notebook is at https://github.com/rsmullen/CodeCoffee/blob/master/CodeCoffee_joblib.ipynb
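If joblib isn't installed, the same two ideas can be approximated with the standard library: `concurrent.futures` for fanning work out, and `functools.lru_cache` for in-memory memoization (unlike `joblib.Memory`, this cache does not persist to disk). This is a stdlib stand-in for illustration, not joblib's own API:

```python
import math
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive(n):
    """Stand-in for a slow computation; repeat calls hit the cache."""
    return math.factorial(n) % 1000003

# Fan the calls out over a pool of workers, in the spirit of
# joblib.Parallel(...)(delayed(expensive)(n) for n in inputs).
# For CPU-bound work you would use ProcessPoolExecutor (inside an
# `if __name__ == "__main__":` guard) to sidestep the GIL.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(expensive, range(10)))
print(results[:6])  # → [1, 1, 2, 6, 24, 120]
```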
Joseph presented the principles behind and use of the `joblib` `Memory` helper for caching. His notebook is at
-
Code Principles and Style
by Harry Krantz • Oct 18, 2017
Things you should do but probably don’t (and probably won’t).
Learn how to write better code that will be less fragile, easier to read and understand, and easier to use.
Download the slides
-
Intro to C++11
by Rixin Li • Oct 11, 2017
-
UA High Performance Computing Resources
by Rachel Smullen and Rixin Li • Oct 4, 2017
This presentation covers the high-performance computing resources available at the University of Arizona. The presentation materials are on Rachel Smullen’s GitHub at https://github.com/rsmullen/UAHPC
For posterity, they’re mirrored here below:
UA HPC Commands
The online documentation can be found here and is a good place to start if you have questions.
Logging in
To log in to the HPC system, from a campus network or the campus VPN, type
ssh -Y username@hpc.arizona.edu
You should come to the login node, called keymaster. You’ll see options to log in to either El Gato or Ocelote. Typing `elgato` or `ocelote` will not allow graphical (X11) windows; for those you need to use `ssh -Y elgato`.

To see what storage disks you have access to, use the command `uquota`.

Loading software
Your profile on the supercomputers’ login nodes doesn’t come with any pre-loaded software. To see available packages, type `module avail`. Then, to load a specific package, type `module load modulename`; for instance, `module load python/3.4.1`. To see what you have loaded, type `module list`. (If you don’t want to do this every time, you can add these commands to your `.bashrc` file.)

Interacting with the scheduler
Ocelote uses a scheduler called PBS, while El Gato uses the LSF scheduler. The commands are similar, but different enough to be a pain.
El Gato
- To see a list of available queues, type `bqueues`.
- To see your running jobs, type `bjobs`.
- To see everyone’s jobs, use `bjobs -u all`.
Ocelote
- To see a list of available queues, type `qstat -q`.
- To see all of your running jobs, type `qstat -u username`.
- To see everyone’s jobs, use `qstat`.
Running Jobs
Embarrassingly Parallel Jobs
These are jobs where you want to execute the same command several times.
El Gato
Here is an example of an El Gato LSF script for an embarrassingly parallel job. Save this in a file named something like `lsf.sh`.

```bash
#!/bin/bash
#BSUB -n 1                       ## number of processors given to each process
#BSUB -e err_somename_%I         ## error files; make somename unique to other runs
#BSUB -o out_somename_%I         ## output notification files
#BSUB -q "your queue"            ## can be windfall, standard, or medium, depending on your advisor's allowed queues
#BSUB -u username
#BSUB -J somename[start-finish]  ## give the job a name (somename), then fill in the processes you want, e.g. [1-100] or [1,2,3]
#BSUB -R "span[ptile=1]"
####BSUB -w "done(JobID|JobName)"  ## ask us about this fanciness

# ${LSB_JOBINDEX} gives the run index 1,2,3...
# Use regular Linux commands to copy/link executables, input files, etc.,
# run python, or whatever else you want to do. It will run in the
# subdirectory some_directory/some_runname${LSB_JOBINDEX}/.
mkdir some_directory
mkdir some_directory/some_runname${LSB_JOBINDEX}
cd some_directory/some_runname${LSB_JOBINDEX}/
echo "I'm Job number ${LSB_JOBINDEX}"
```
To execute this script, use `bsub < lsf.sh`. You can then check your job’s status with `bjobs`.

Ocelote
Here’s the same for Ocelote. The PBS scheduler is different in that you submit a job array. Save this script as something like `pbs.sh`.

```bash
## choose windfall or standard
#PBS -q queuename
## select nodes:cpus per node:memory per node
#PBS -l select=1:ncpus=1:mem=6gb
## the name of your job
#PBS -N jobname
## the name of your group, typically your advisor's username
#PBS -W group_list=yourgroup
## how the scheduler fills in your nodes
#PBS -l place=pack:shared
## the length of time for your job
#PBS -l walltime=1:00:00
## the indexes of your job array
#PBS -J 1-5
## the location for your error files; this must exist first
#PBS -e errorfiles/
## the location for your output files; this must exist first
#PBS -o outfiles/

# Now you can use your normal linux commands
# Run the program for individual core ${PBS_ARRAY_INDEX}
echo "I'm Job number ${PBS_ARRAY_INDEX}"
```
You can submit your job with `qsub pbs.sh` and then you can check your job with `qstat -u yourname -t`.

Parallel Jobs
We can also run parallel jobs on a supercomputer. (After all, that’s what they were designed for!)
El Gato
Here’s an example MPI script. Save it in `lsf.sh`. You can get the code in Rixin’s directory at `/home/u5/rixin/mpi_hello_world`.

```bash
#!/bin/bash
# set the job name
#BSUB -J mpi_test
# set the number of cores in total
#BSUB -n 32
# request 16 cores per node
#BSUB -R "span[ptile=16]"
# request standard output (stdout) to file lsf.out
#BSUB -o lsf.out
# request error output (stderr) to file lsf.err
#BSUB -e lsf.err
# set the queue for this job
#BSUB -q "medium"
#---------------------------------------------------------------------

### load modules needed
module load openmpi

### pre-execution work
cd ~/mpi_hello_world
make  # compile the code, in this example case

### set directory for job execution
cd ./elgato_sample_run

### run your program
mpirun -np 32 ../mpi_hello_world > elgato_sample_output.txt

### end of script
```
Use the same commands to submit and check the status as before.
Ocelote
And the same for Ocelote:
```bash
#!/bin/bash
## set the job name
#PBS -N mpi_test
## set the PI group for this job
#PBS -W group_list=kkratter
## set the queue for this job as windfall
#PBS -q windfall
## request email when job begins and ends
#PBS -m bea
## set the number of nodes, cores, and memory that will be used
#PBS -l select=2:ncpus=28:mem=1gb
## specify "wallclock time" required for this job, hhh:mm:ss
#PBS -l walltime=00:01:00
## specify cpu time = walltime * num_cpus
#PBS -l cput=1:00:00

### load modules needed
module load openmpi/gcc/2

### pre-execution work
cd ~/mpi_hello_world
make  # compile the code, in this example case

### set directory for job execution
cd ./ocelote_sample_run

### run your executable program with begin and end date and time output
date
/usr/bin/time mpirun -np 56 ../mpi_hello_world > ocelote_sample_output.txt
date
```
Killing jobs
If you realize you made a mistake, or you want to kill a job that has been running for too long, use `bkill jobid` on El Gato or `qdel jobid[].head1` on Ocelote.

Interactive Nodes
Do not, I repeat, DO NOT run programs on the login node. You’re using up resources for people who just want to check their job status! Instead, you can request an interactive node, which lets you run programs on a compute node where you can use as much of the machine as you want.
To get an interactive node, you submit a job to the scheduler requesting interactive resources. On El Gato, use `bsub -XF -Is bash`, and on Ocelote, use `qsub -I -N jobname -W group_list=groupname -q yourqueue -l select=1:ncpus=28:mem=168gb -l cput=1:0:0 -l walltime=1:0:0`.
-
Github hands on
by Ekta Patel & Nico Garavito • Sep 27, 2017
This is a hands-on tutorial on GitHub that will cover the following topics:
- Creating and setting up a GitHub repository.
- How to commit to your repository.
- How to create branches.
- How to fork and make pull requests.
Here are the slides!
Feel free to email us at:
ektapatel [at] email [dot] arizona [dot] edu
jngaravitoc [at] email [dot] arizona [dot] edu
-
Object Oriented Programming in Python
by Samuel Wyatt: Giver, Lover, Friend • Sep 20, 2017
You can find my talk as a Jupyter Notebook located here:
https://github.com/swyatt7/CC_ObjectOriented
It gives an overview of the four pillars of Object Oriented Programming (OOP), then provides a basic tutorial on how to implement classes in Python. There are also some astronomical examples, followed by some cool OOP aspects in Python.
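As a minimal sketch of those pillars in Python (the `Telescope` classes here are invented for illustration and are not taken from the notebook):

```python
import math

class Telescope:
    """Encapsulation: state lives inside the object, accessed via methods."""

    def __init__(self, name, aperture_m):
        self.name = name
        self.aperture_m = aperture_m

    def collecting_area(self):
        """Abstraction: callers get an area without knowing the formula."""
        return math.pi * (self.aperture_m / 2) ** 2

    def describe(self):
        return f"{self.name}: {self.aperture_m} m aperture"

class RadioTelescope(Telescope):
    """Inheritance: reuses Telescope and specializes it."""

    def describe(self):
        # Polymorphism: the same call does the right thing for each subclass.
        return f"{self.name}: {self.aperture_m} m dish (radio)"

for scope in [Telescope("MMT", 6.5), RadioTelescope("GBT", 100.0)]:
    print(scope.describe())
```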
If you ever have any questions, feel free to email me at swyatt@email.arizona.edu
-
Achieving maximum website
by Joseph Long • Sep 13, 2017
Slides (PDF): Download
Resources
GitHub Pages
Free hosting for small websites, decoupled from current institutional affiliation (but dependent on the continued generosity of a private company). GitHub Pages can automatically run Jekyll for you (see the section on static site generators).
Domainr
Check for availability of a domain name quickly.
Static site generators
Static site generators run once after you update your site and regenerate any modified HTML pages. Maintenance- and performance-wise, static HTML pages are much nicer than dynamically generated sites (think, in order of decade: SSI, PHP, Perl, Python, Ruby, etc.).
Jekyll is written in Ruby and has pretty good documentation.
Pelican is written in Python and has a comparable feature set, but the documentation is a bit confusing. (I like the template syntax it uses a little more.)
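The regenerate-only-what-changed idea at the heart of these tools can be sketched in a few lines of Python (the file layout and page format are invented for illustration; real generators like Jekyll and Pelican do far more):

```python
import tempfile
from pathlib import Path

def build_site(source_dir, output_dir):
    """Rebuild only the pages whose source changed since the last build."""
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    output_dir.mkdir(exist_ok=True)
    rebuilt = []
    for page in sorted(source_dir.glob("*.txt")):  # hypothetical page sources
        out = output_dir / (page.stem + ".html")
        # Skip pages whose HTML is already at least as new as the source.
        if out.exists() and out.stat().st_mtime >= page.stat().st_mtime:
            continue
        out.write_text(f"<html><body><p>{page.read_text()}</p></body></html>")
        rebuilt.append(out.name)
    return rebuilt

# Example: write one source page, then build twice; the second run is a no-op.
root = Path(tempfile.mkdtemp())
(root / "pages").mkdir()
(root / "pages" / "index.txt").write_text("Hello, static web!")
print(build_site(root / "pages", root / "site"))  # → ['index.html']
print(build_site(root / "pages", root / "site"))  # → [] (nothing changed)
```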
Experimenting with HTML/CSS/JS
Two useful sites for experimenting with HTML markup, CSS, or JavaScript code (e.g. to produce a minimal example of some issue you have) are jsfiddle.net (which I showed) and CodePen.io. They offer a multi-panel editor where you can see the resulting page in the same browser window.