Code Coffee is a bring-your-own-coffee discussion group on computing techniques and tools at Steward Observatory at the University of Arizona.

Announcements are distributed via the astro-code-coffee mailing list.

Next meeting

Topic: TBD

Speaker: TBD

Time: TBD

Location: TBD


  • Introductory C programming for astrophysicists

    Dr. Psaltis gave a one-hour crash course on C programming based on material from his class.

    Slides (PDF)

    Code listings

    ex1.c

    Compile with gcc ./ex1.c -o ex1, run with ./ex1.

    #include <stdio.h>
    
    int main(void) {
        float x;
        /* 0.1 has no exact binary representation, so rounding error
           accumulates in x with every iteration -- watch the output. */
        for (x = 0; x <= 1; x += 0.1) {
            printf("x = %10.8f f(x) = %10.8f\n", x, x * x);
        }
        return 0;
    }
    

    ex2.c

    Compile with gcc ./ex2.c -o ex2, run with ./ex2.

    (Note: on Linux you need to link the math library explicitly: gcc ./ex2.c -o ex2 -lm. See this StackOverflow answer for historical context.)

    #include <stdio.h>
    #include <math.h>
    #include <time.h>
    
    #define Nrep 1000000
    
    int main(void) {
        double x = 1.3, a;
        double time, Mflops;
        int i;
        clock_t ticks1, ticks2;
    
        /* Time Nrep repetitions of a single floating-point operation.
           (Compile without optimization, or the compiler may delete
           the loop entirely, since a is never used afterwards.) */
        ticks1 = clock();
        for (i = 1; i <= Nrep; i++) {
            a = x + x;
        }
        ticks2 = clock();
    
        /* average CPU time per iteration, in seconds */
        time = (1.0 * (ticks2 - ticks1)) / CLOCKS_PER_SEC / Nrep;
        /* one floating-point operation per iteration => 1e-6/time MFLOPS */
        Mflops = 1.e-6 / time;
    
        printf("it took %e seconds\n", time);
        printf("this corresponds to %f MFLOPS\n", Mflops);
        return 0;
    }
    

    Replace the loop body (a = x + x) with lines from the following block and see how the number of MFLOPS changes.

    a=i*x;
    a=i/x;
    a=i/x/x;
    a=i/(x*x);
    a=sin(x)*sin(x)+2.*cos(x)*cos(x);
    a=1.+cos(x)*cos(x);
    a=log(x);
    a=pow(x,5.);
    a=x*x*x*x*x;
    a=i/sqrt(pow(sin(x),2.000001)+2.*pow(cos(x),2.000001));
    a=i*pow(1.+cos(x)*cos(x),-0.5);
    
  • Structure and Development of Computer Programs

    Slides (PDF)

    Presentation materials: https://github.com/Sbozzolo/structure_development_tucson

    Book recommendations:

  • Interactive data visualization with Bokeh

    Slides

    Examples

  • Using containers to move computations to HPC and the Cloud

    Containers allow you to bundle up a program or script, together with all the packages and libraries it uses, into a single archive that runs on any computer with a container runtime like Docker or Singularity (including the University HPC systems).

    To follow the slides, you will need to create a DockerHub account and install Docker Desktop for your platform:

    The full text of the example Dockerfile we built is below:

    # Start from the CentOS 6.10 base image
    FROM centos:6.10
    # Enable the EPEL repository and bring the base packages up to date
    RUN yum install epel-release -y
    RUN yum update -y
    # Add the third-party Nux Dextop repository, which packages ffmpeg for EL6
    RUN rpm --import http://li.nux.ro/download/nux/RPM-GPG-KEY-nux.ro
    RUN rpm -Uvh http://li.nux.ro/download/nux/dextop/el6/x86_64/nux-dextop-release-0-2.el6.nux.noarch.rpm
    # Install ffmpeg and its development headers
    RUN yum install ffmpeg ffmpeg-devel -y
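
    If you want to try it afterwards, save this as Dockerfile in an empty directory; then something like docker build -t ffmpeg-el6 . should build the image, and docker run --rm -it ffmpeg-el6 ffmpeg -version should run the installed ffmpeg inside a fresh container (the ffmpeg-el6 tag is just an example name).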
    
  • git and GitHub for Research

    Slides.

    Additional resources

  • How to accelerate your code in under 10 lines

    Rachel Smullen gave a talk on using OpenACC, a system of hints for your C, C++, and Fortran code that enables a compiler to move computations to the GPU or other accelerators. She attended the International High Performance Computing Summer School last summer, which covered OpenACC, and was kind enough to share what she learned with all of us.
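
    For a taste of the syntax: in C, putting #pragma acc parallel loop on the line above a for loop asks an OpenACC-aware compiler to parallelize that loop and offload it to an attached accelerator.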

    Her slides

    Example code to start from (exercise and solution in the slides)

    There are also some slides from IHPCSS by John Urbanic that cover OpenACC in more detail:

  • Intro to CyVerse

    What is CyVerse?

    CyVerse started out in 2008 as the NSF-funded ‘iPlant Collaborative’ project, created to serve the changing needs of the life sciences, which have increasingly heavy computational demands (think protein folding, genetic phenotyping, etc.) but have historically not had high-performance computing facilities. The project was rechristened ‘CyVerse’ in 2015.

    Why should an astronomer care about it?

    CyVerse cyberinfrastructure has the serendipitous effect of providing computational resources to basically anyone here at UA and >8,000 participating institutions.

    Possible reasons to care:

    • If you have computational needs that are greater than what your office machines can provide, but less than what would usually pass for HPC jobs
    • If you want to calculate something now, and don’t want to put your job in a queue
    • It is very flexible for remote access or access from low-performance machines (basically all you need is ‘ssh’)
    • This infrastructure serves as a lead-in for addressing needs that span the community, like interoperability and reproducibility.

    This talk was presented in a Jupyter Notebook which can be viewed on GitHub or downloaded directly.

  • How-to Docker (with astroML)

    Slides (PDF)

    This is a minimal Docker tutorial using astroML as an example. Docker is needed in order to try it out; the Docker Community Edition (CE) is free and is available at the Docker website. It supports all the major platforms, from Linux (e.g., Ubuntu) to Mac OS X and even Windows.

    The supporting files for this tutorial are available in the repository for this site under downloads/2017-18/chan-howto-docker-astroML.

    Dockerfile is the main part of this tutorial. It tells Docker how to create a Docker image, which you can then instantiate into Docker containers on any machine with Docker: from your laptop, to a powerful many-core virtual machine (VM) on CyVerse, to thousands of VMs that you launch with container orchestration platforms on Google Cloud Platform.

    Makefile contains the commands that we will run during the tutorial.

    plot_spectrum_sum_of_norms.py is an example from astroML. It is modified to run better in a container environment.

  • Real Programmers Debug with Fire Extinguishers

    Talk slides are available here. A hands-on session with real hardware is offered during the normal Code Coffee time slot (maybe 11/29 or 12/6); email Craig (ckulesa@email…) if you’re interested.

    Resources

    The Antarctic observatory used as an example in the talk is the High Elevation Antarctic Terahertz (HEAT) telescope at Ridge A.

    Go build something and control it with software. Sparkfun and Adafruit have a lot of good resources and some nifty development boards to get you started.

    Talking to hardware: SPI and I2C

    Wikipedia links for I2C and SPI.

    A very low-level example of performing SPI communication via a microcontroller in C is here.

    For the python-centric, look at spidev and I2C, also this.
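
    A minimal spidev sketch, assuming a Linux machine with an SPI device exposed as /dev/spidev0.0 (the clock settings and bytes below are placeholders, not a real device protocol):

    import spidev  # Linux userspace SPI bindings
    
    spi = spidev.SpiDev()
    spi.open(0, 0)              # bus 0, chip select 0 -> /dev/spidev0.0
    spi.max_speed_hz = 500000   # clock rate; device-specific
    spi.mode = 0                # clock polarity/phase; device-specific
    
    # SPI is full duplex: one byte comes back for every byte sent
    response = spi.xfer2([0x01, 0x80])
    print(response)
    spi.close()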

    Talking to hardware: Serial (RS232, RS485, RS422)

    If you get a USB-to-serial converter, ones with a Prolific PL2303 chipset will work generically under just about any operating system without fuss. They can be purchased with flying leads for about $5 and with a DB9 connector for $10-20.

    The bible: Serial Programming Guide for POSIX Operating Systems.

    The pySerial API is excellent and will get you to a working device quickly.
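
    A minimal pySerial sketch (the port name, baud rate, and query string are all device-specific assumptions):

    import serial  # pySerial
    
    ser = serial.Serial('/dev/ttyUSB0', baudrate=9600, timeout=1)
    ser.write(b'*IDN?\r\n')      # hypothetical identification query
    reply = ser.readline()       # reads until newline or the 1 s timeout
    print(reply.decode(errors='replace'))
    ser.close()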

    Talking to hardware: CAN bus

    The SocketCAN or can-utils package will let you display, record, generate, or replay CAN traffic.

    The python-can library is excellent.
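
    A minimal python-can sketch, assuming a SocketCAN interface named can0 is already configured (the arbitration ID and payload are arbitrary examples):

    import can  # python-can
    
    bus = can.interface.Bus(channel='can0', bustype='socketcan')
    
    msg = can.Message(arbitration_id=0x123, data=[0x11, 0x22, 0x33])
    bus.send(msg)
    
    reply = bus.recv(timeout=1.0)  # returns None if nothing arrives within 1 s
    print(reply)
    bus.shutdown()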

    Wrappers for astronomical hardware

    Abstract your system onto the network using network sockets. Examples of basic client and server operations can be found in most languages, like C and Python. This lets you control your instrument using scripts or a GUI without impacting the hardware or low-level control software itself.
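
    A minimal sketch of the pattern in Python (the port number and the one-command "protocol" are made up for illustration):

    import socket
    
    HOST, PORT = '', 5000   # listen on all interfaces, arbitrary port
    
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((HOST, PORT))
        srv.listen(1)
        conn, addr = srv.accept()   # wait for one client to connect
        with conn:
            cmd = conn.recv(1024).decode().strip()
            # ...translate cmd into a call to the low-level control code...
            conn.sendall(b'OK\n')

    A client is then just socket.create_connection(('hostname', 5000)), a send, and a read, whether it lives in a script or behind a GUI button.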

    INDI, the Instrument Neutral Distributed Interface is a nifty way to wrap up multiple elements of an astronomical system into a clean, self-describing system.

  • Python + joblib: Make your computer work harder, and save yourself time

    The joblib package (pip install joblib) provides helpers for easy parallelization and caching (memoization) of function outputs.

    Rachel presented examples of joblib’s Parallel helper. Her notebook is at

    https://github.com/rsmullen/CodeCoffee/blob/master/CodeCoffee_joblib.ipynb

    Joseph presented the principles behind and use of the joblib Memory helper for caching. His notebook is at

    https://github.com/ua-astro-grads/ua-astro-grads.github.io/blob/master/downloads/2017-18/joblib_memoization.ipynb
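
    A minimal sketch (not from either notebook) of how the two helpers combine; the function and cache-directory names here are made up:

    from joblib import Parallel, delayed, Memory
    
    memory = Memory('./joblib_cache', verbose=0)   # on-disk cache location
    
    @memory.cache                 # results cached on disk, keyed by arguments
    def slow_square(x):
        return x * x              # stand-in for an expensive computation
    
    # fan the calls out across 4 worker processes (n_jobs=-1 uses every core)
    results = Parallel(n_jobs=4)(delayed(slow_square)(i) for i in range(10))
    print(results)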

  • Code Principles and Style

    Things you should do but probably don’t (and probably won’t).

    Learn how to write better code that will be less fragile, easier to read and understand, and easier to use.

    Download the slides

  • Intro to C++11

    Slides (PDF).

  • UA High Performance Computing Resources

    This presentation covers the high-performance computing resources available at the University of Arizona. The presentation materials are on Rachel Smullen’s GitHub at https://github.com/rsmullen/UAHPC

    For posterity, they’re mirrored here below:

    Slides (PDF)

    UA HPC Commands

    The online documentation can be found here and is a good place to start if you have questions.

    Logging in

    To log in to the HPC system, from a campus network or the campus VPN, type

    ssh -Y username@hpc.arizona.edu
    

    You should arrive at the login node, called keymaster, where you’ll see options to log in to either El Gato or Ocelote. Typing elgato or ocelote at the prompt will not forward X11 windows; to keep window forwarding, use ssh -Y elgato (or ssh -Y ocelote).

    To see what storage disks you have access to, use the command uquota.

    Loading software

    Your profile on the supercomputers’ login nodes doesn’t come with any pre-loaded software. To see available packages, type module avail. Then, to load a specific package, type module load modulename.

    For instance, module load python/3.4.1. To see what you have loaded, type module list. (If you don’t want to do this every time, you can add these commands to your .bashrc file.)

    Interacting with the scheduler

    Ocelote uses a scheduler called PBS, while El Gato uses the LSF scheduler. The commands are similar, but different enough to be a pain.

    El Gato

    • To see a list of available queues, type bqueues.
    • To see your running jobs, type bjobs.
    • To see everyone’s jobs, use bjobs -u all.

    Ocelote

    • To see a list of available queues, type qstat -q.
    • To see all of your running jobs, type qstat -u username.
    • To see everyone’s jobs, use qstat.

    Running Jobs

    Embarrassingly Parallel Jobs

    These are jobs where you want to execute the same command several times, with each run independent of the others.

    El Gato

    Here is an example of an El Gato LSF script for an embarrassingly parallel job. Save this in a file named something like lsf.sh.

    #!/bin/bash
    #BSUB -n 1                         ## number of processors given to each process
    #BSUB -e err_somename_%I           ## error files; make somename unique to other runs
    #BSUB -o out_somename_%I           ## output notification files
    #BSUB -q "your queue"              ## can be windfall, standard, or medium, depending on your advisor's allowed queues.
    #BSUB -u username
    #BSUB -J somename[start-finish]    ## Give the job a name (somename) and then fill in the processes you want, eg [1-100] or [1,2,3]
    #BSUB -R "span[ptile=1]"
    ####BSUB -w "done(JobID|JobName)"  ## Ask us about this fanciness
    
    # ${LSB_JOBINDEX} gives the run index 1,2,3...
    
    # use regular linux commands to copy/link executables, input files, etc., run python, or whatever else you want to do.  It will run in the subdirectory some_directory/some_runname${LSB_JOBINDEX}/. 
    
    mkdir some_directory
    mkdir some_directory/some_runname${LSB_JOBINDEX}
    cd some_directory/some_runname${LSB_JOBINDEX}/
    echo "I'm Job number ${LSB_JOBINDEX}"
    

    To execute this script, use bsub < lsf.sh. You can then check your job’s status with bjobs.

    Ocelote

    Here’s the same for Ocelote. The PBS scheduler is different in that you submit a job array. Save this script as something like pbs.sh.

    #!/bin/bash
    ## choose windfall or standard
    #PBS -q queuename
    ## select nodes:cpus per node:memory per node
    #PBS -l select=1:ncpus=1:mem=6gb
    ## the name of your job
    #PBS -N jobname
    ## the name of your group, typically your advisor's username
    #PBS -W group_list=yourgroup
    ## how the scheduler fills in your nodes
    #PBS -l place=pack:shared
    ## the length of time for your job
    #PBS -l walltime=1:00:00
    ## the indexes of your job array
    #PBS -J 1-5
    ## the location for your error files; this must exist first
    #PBS -e errorfiles/
    ## the location for your output files; this must exist first
    #PBS -o outfiles/
    
    # Now you can use your normal linux commands
    
    # Run the program for array index ${PBS_ARRAY_INDEX}
    echo  "I'm Job number ${PBS_ARRAY_INDEX}"
    

    You can submit your job with qsub pbs.sh and then you can check your job with qstat -u yourname -t.

    Parallel Jobs

    We can also run parallel jobs on a supercomputer. (After all, that’s what they were designed for!)

    El Gato

    Here’s an example MPI script. Save it in lsf.sh. You can get the code in Rixin’s directory at /home/u5/rixin/mpi_hello_world.

    #!/bin/bash
    ###========================================
    # set the job name
    #BSUB -J mpi_test
    # set the number of cores in total
    #BSUB -n 32
    # request 16 cores per node
    #BSUB -R "span[ptile=16]"
    # request standard output (stdout) to file lsf.out
    #BSUB -o lsf.out
    # request error output (stderr) to file lsf.err
    #BSUB -e lsf.err
    # set the queue for this job (windfall, standard, or medium)
    #BSUB -q "medium"
    #---------------------------------------------------------------------
    
    ### load modules needed
    module load openmpi
    
    ### pre-execution work
    cd ~/mpi_hello_world
    make # compile the code, in this example case
    
    ### set directory for job execution
    cd ./elgato_sample_run
    ### run your program
    mpirun -np 32 ../mpi_hello_world > elgato_sample_output.txt
    ### end of script
    

    Use the same commands to submit and check the status as before.

    Ocelote

    And the same for Ocelote:

    #!/bin/bash
    ##set the job name
    #PBS -N mpi_test
    ##set the PI group for this job
    #PBS -W group_list=kkratter
    ##set the queue for this job as windfall
    #PBS -q windfall
    ##request email when job begins and ends
    #PBS -m bea
    ##set the number of nodes, cores, and memory that will be used
    #PBS -l select=2:ncpus=28:mem=1gb
    ##specify "wallclock time" required for this job, hhh:mm:ss
    #PBS -l walltime=00:01:00
    ##specify cpu time = walltime * num_cpus
    #PBS -l cput=1:00:00
    
    ###load modules needed
    module load openmpi/gcc/2
    
    ###pre-execution work
    cd ~/mpi_hello_world
    make # compile the code, in this example case
    
    ###set directory for job execution
    cd ./ocelote_sample_run
    ###run your executable program with begin and end date and time output
    date
    /usr/bin/time mpirun -np 56 ../mpi_hello_world > ocelote_sample_output.txt
    date
    

    Killing jobs

    If you realize you made a mistake, or you want to kill a job that has been running for too long, use bkill jobid on El Gato or qdel jobid[].head1 on Ocelote.

    Interactive Nodes

    Do not, I repeat, DO NOT run programs on the login node. You’re using up resources for people who just want to check their job status! Instead, request an interactive node, which lets you run programs on a compute node and use as much of its resources as you want.

    To get an interactive node, you submit a job to the scheduler requesting interactive resources. On El Gato, use bsub -XF -Is bash and on Ocelote, use qsub -I -N jobname -W group_list=groupname -q yourqueue -l select=1:ncpus=28:mem=168gb -l cput=1:0:0 -l walltime=1:0:0.

  • GitHub hands-on

    This is a hands-on tutorial on GitHub that will cover the following topics:

    • Creating and setting up a GitHub repository.
    • How to commit to your repository.
    • How to create branches.
    • How to do forks and pull requests.

    Here are the slides!

    Feel free to email us at:

    ektapatel [at] email [dot] arizona [dot] edu

    jngaravitoc [at] email [dot] arizona [dot] edu

  • Object Oriented Programming in Python

    You can find my talk as a Jupyter Notebook located here:

    https://github.com/swyatt7/CC_ObjectOriented

    It gives an overview of the four pillars of Object-Oriented Programming (OOP), then provides a basic tutorial on how to implement classes in Python. There are also some astronomical examples, followed by some cool OOP features in Python.
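
    As a tiny illustration of the kind of thing classes give you (this example is mine, not from the notebook):

    class Star:
        def __init__(self, name, magnitude):   # encapsulation: data lives with behavior
            self.name = name
            self.magnitude = magnitude
    
        def is_brighter_than(self, other):
            return self.magnitude < other.magnitude   # smaller magnitude = brighter
    
    class VariableStar(Star):                   # inheritance: a VariableStar is a Star
        def __init__(self, name, magnitude, period_days):
            super().__init__(name, magnitude)
            self.period_days = period_days
    
    sirius = Star("Sirius", -1.46)
    mira = VariableStar("Mira", 3.0, 332)
    print(sirius.is_brighter_than(mira))        # True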

    If you ever have any questions, feel free to email me at swyatt@email.arizona.edu

  • Achieving maximum website

    Slides (PDF): Download

    Resources

    GitHub Pages

    Free hosting for small websites, decoupled from current institutional affiliation (but dependent on the continued generosity of a private company). GitHub Pages can automatically run Jekyll for you (see the section on static site generators).

    Domainr

    Check for availability of a domain name quickly.

    Static site generators

    Static site generators run once after you update your site and regenerate any modified HTML pages. In terms of maintenance and performance, static HTML pages are much nicer than dynamically generated sites (think, in order of decade: SSI, PHP, Perl, Python, Ruby, etc.).

    Jekyll is written in Ruby and has pretty good documentation.

    Pelican is written in Python and has a comparable feature set, but the documentation is a bit confusing. (I like the template syntax it uses a little more.)

    Experimenting with HTML/CSS/JS

    Two useful sites for experimenting with HTML markup, CSS, or JavaScript code (e.g. to produce a minimal example of some issue you have) are jsfiddle.net (which I showed) and CodePen.io. They offer a multi-panel editor where you can see the resulting page in the same browser window.

subscribe via RSS