+ - 0:00:00
Notes for current slide

Presenter notes contain extra information which might be useful if you intend to use these slides for teaching.

Press P again to switch presenter notes off

Press C to create a new window where the same presentation will be displayed. This window is linked to the main window. Changing slides on one will cause the slide to change on the other.

Useful when presenting.

Notes for next slide



Galaxy Interactive Environments



last_modification Updated:   purlPURL: gxy.io/GTN:S00048

text-document Plain-text slides |

Tip: press P to view the presenter notes | arrow-keys Use arrow keys to move between slides
1 / 37

Presenter notes contain extra information which might be useful if you intend to use these slides for teaching.

Press P again to switch presenter notes off

Press C to create a new window where the same presentation will be displayed. This window is linked to the main window. Changing slides on one will cause the slide to change on the other.

Useful when presenting.

Requirements

Before diving into this slide deck, we recommend you to have a look at:

  • Docker basics
2 / 37

question Questions

  • What are Galaxy Interactive Environments (GIEs)?

  • How to enable GIEs in Galaxy?

  • How to develop your own GIE?

3 / 37

objectives Objectives

  • Implement a Hello-World Galaxy Interactive Environment
4 / 37

Interactive Environments

5 / 37

Why IEs?

  • Embedded access to third-party application inside of Galaxy
  • Interactively analyze data, access analysis products within Galaxy

Ipython is shown in the center pane of galaxy

  • Bring external analysis platform to the data instead of vice-versa
    • no need to download/re-upload your data
6 / 37

Who should use IEs?

  • Everyone!
  • Programming Environments for Bioinformaticians and Data Scientists
    • Jupyter
    • Rstudio
  • Visualization IEs are great for life scientists
    • IOBIO (bam/vcf visualizations)
    • Phinch (metagenomics visualizations)
7 / 37

Types of visualizations in Galaxy

GIE for visualization? Check that it is the right choice for your project

  • Trackster - built-in genome browser
  • Display applications
    • UCSC Genome Browser
    • IGV
  • Galaxy tools
    • JBrowse
    • Krona
  • Visualization plugins
    • Charts
    • Generic
  • Interactive Environments
    • Jupyter/Rstudio
    • IOBIO (bam/vcf visualizations)
    • Phinch (metagenomics visualizations)
8 / 37

Which should I use?

Flowchart. Only available on an external website? If yes use a display application. Does it need to be served (e.g. python), if yes use an interactive tool. Is it computationally intensive, then it needs to be a regular tool. Is it written in javascript? Then it shold be a generic plugin. If it passes all these tests it can be a charts plugin.

9 / 37

How to launch an IE?

  • Can be bound to specific datatypes
    • Available under the visualizations button on the dataset

visualisation button in galaxy is clicked on a dataset. Jupyter and Rstudio appear below charts.

  • Or more general-purpose applications (Jupyter/Rstudio)
  • IE launcher

A drop down menu from the masthead of galaxy shows interactive environments as well.

10 / 37

IE Launcher

  • Choose between different available docker images
  • Attach one or more datasets from history

An IE launcher is shown with select boxes for the GIE, image, and datasets. Launch is shown below.

11 / 37

How does it work?

  • Docker Containers are launched on-demand by users..
  • ..and killed automatically when users stop using them

schematic of request flow. A users requests an IE session with galaxy which connects to a docker host, launches the IE, and communicates this to a NodeJS proxy living next to galaxy. The user opens the connection through the proxy to access the container.

Admin Docs: https://docs.galaxyproject.org/en/master/admin/interactive_environments.html

12 / 37

Jupyter

  • General purpose/ multi-dataset
  • Provides special functions to interact with the Galaxy history (get/put datasets)
  • Ability to save and load notebooks

jupyter seen in the main panel of galaxy

13 / 37

Jupyter

the above screenshot but they have scrolled to show embedded plots computed in jupyter in galaxy

14 / 37

Jupyter

Another screenshot of jupyter in galaxy, this time fetching datasets from galaxy and running samtools.

15 / 37

Rstudio

  • General purpose/ multi-dataset
  • Provides special functions to interact with the Galaxy history
  • Ability to save and load workbook and R history object

default rstudio in galaxy, multiple panes are visible for computing, values, and files.

16 / 37

IOBIO

  • Visualizes single dataset
  • Only available for datasets of specific formats

IOBIO are dynamic javascript-y visualisations with lots of constantly updating graphs that are recalculated on the fly. They are shown in two scratchbooks in Galaxy

17 / 37

IOBIO

a close up of iobio with several graphs and illegible text.

18 / 37

Phinch

Phinch is shown inside galaxy, another visually intensive graphing application for metagenomics data.

19 / 37

Admin

  • Prerequisites: NodeJs (npm) and Docker; Galaxy user must be able to talk to the docker daemon
  • Enable IEs in galaxy.yml
    interactive_environment_plugins_directory = config/plugins/interactive_environments
  • Install node proxy
    $ cd $GALAXY/lib/galaxy/web/proxy/js/
    $ npm install .
  • Can configure GIEs to run on another host

Advanced configurations: https://docs.galaxyproject.org/en/master/admin/interactive_environments.html

20 / 37

Development

  • Not hard to build!
  • All the magic is in:
    $GALAXY/config/plugins/interactive_environments/$ie_name/
Component File
Visualization Plugin Configuration ../config/${ie_name}.xml
IE specific Configuration ../config/${ie_name}.ini
Mako Template ../templates/${ie_name}.mako
21 / 37

Development

schematic of GIE with a box on the left labelled ipython.mako being invoked by a launch ipython notebook call. This has boxes for running a docker container and then proxying authentication. These point to a docker container with a config.yaml containing the history id, api key and password, the ipython galaxy notebook, and the ipython webservice which loops while the connection is active. This proxies the connection and points to a cartoon of ipython in galaxy's center panel.

22 / 37

Hello World Example

$ tree $GALAXY_ROOT/config/plugins/interactive_environments/helloworld/
config/plugins/interactive_environments/helloworld/
├── config
│ ├── helloworld.ini
│ ├── helloworld.ini.sample
│ └── helloworld.xml
├── static
│ └── js
│ └── helloworld.js
└── templates
└── helloworld.mako
23 / 37

Create GIE plugin XML file config/helloworld.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE interactive_environment SYSTEM "../../interactive_environments.dtd">
<!-- This is the name which will show up in the User's Browser -->
<interactive_environment name="HelloWorld">
<data_sources>
<data_source>
<model_class>HistoryDatasetAssociation</model_class>
<!-- filter which types of datasets are appropriate for this GIE -->
<test type="isinstance" test_attr="datatype"
result_type="datatype">tabular.Tabular</test>
<test type="isinstance" test_attr="datatype"
result_type="datatype">data.Text</test>
<to_param param_attr="id">dataset_id</to_param>
</data_source>
</data_sources>
<params>
<param type="dataset" var_name_in_template="hda" required="true">dataset_id</param>
</params>
<!-- Be sure that your entrypoint name is correct! -->
<entry_point entry_point_type="mako">helloworld.mako</entry_point>
</interactive_environment>
24 / 37

Set up .ini file, which controls docker interaction config/helloworld.ini.sample

[main]
# Unused
[docker]
# Command to execute docker. For example `sudo docker` or `docker-lxc`.
#command = docker {docker_args}
# The docker image name that should be started.
image = hello-ie
# Additional arguments that are passed to the `docker run` command.
#command_inject = --sig-proxy=true -e DEBUG=false
# URL to access the Galaxy API with from the spawn Docker container, if empty
# this falls back to galaxy.yml's galaxy_infrastructure_url and finally to the
# Docker host of the spawned container if that is also not set.
#galaxy_url =
# The Docker hostname. It can be useful to run the Docker daemon on a different
# host than Galaxy.
#docker_hostname = localhost
[..]
25 / 37
  • Create mako template templates/helloworld.mako
    • Loads configuration from ini file
    • launches docker container,
    • builds a URL to the correct endpoint through Galaxy NodeJS proxy
    • set environment variable CUSTOM to be passed to container
    • attach dataset selected by user (hda)
<%namespace name="ie" file="ie.mako" />
<%
# Sets ID and sets up a lot of other variables
ie_request.load_deploy_config()
# Define a volume that will be mounted into the container.
# This is a useful way to provide access to large files in the container,
# if the user knows ahead of time that they will need it.
user_file = ie_request.volume(
hda.file_name, '/import/file.dat', how='ro')
# Launch the IE. This builds and runs the docker command in the background.
ie_request.launch(
volumes=[user_file],
env_override={
'custom': '42'
}
)
[..]
26 / 37

(continued)

[..]
# Only once the container is launched can we template our URLs. The ie_request
# doesn't have all of the information needed until the container is running.
url = ie_request.url_template('${PROXY_URL}/helloworld/')
%>
<html>
<head>
${ ie.load_default_js() }
</head>
<body>
<script type="text/javascript">
${ ie.default_javascript_variables() }
var url = '${ url }';
${ ie.plugin_require_config() }
requirejs(['interactive_environments', 'plugin/helloworld'], function(){
load_notebook(url);
});
</script>
<div id="main" width="100%" height="100%">
</div>
</body>
</html>
27 / 37
Lastly we must write the load_notebook function, static/js/helloworld.js
function load_notebook(url){
$( document ).ready(function() {
test_ie_availability(url, function(){
append_notebook(url)
});
});
}
28 / 37

Hello World Example

  • The only thing missing now is the GIE (Docker) container itself
  • Container typically consists of:
    • Dockerfile
    • Proxy configuration (e.g. nginx)
    • Custom startup script/entrypoint
    • Script to monitor traffic and kill unused containers
    • The actual application for the users (here: simple python process which serves directory contents of /import folder of container)
29 / 37
FROM ubuntu:14.04
# These environment variables are passed from Galaxy to the container
# and help you enable connectivity to Galaxy from within the container.
# This means your user can import/export data from/to Galaxy.
ENV DEBIAN_FRONTEND=noninteractive \
API_KEY=none \
DEBUG=false \
PROXY_PREFIX=none \
GALAXY_URL=none \
GALAXY_WEB_PORT=10000 \
HISTORY_ID=none \
REMOTE_HOST=none
RUN apt-get -qq update && \
apt-get install --no-install-recommends -y \
wget procps nginx python python-pip net-tools nginx
# Our very important scripts. Make sure you've run `chmod +x startup.sh
# monitor_traffic.sh` outside of the container!
ADD ./startup.sh /startup.sh
ADD ./monitor_traffic.sh /monitor_traffic.sh
# /import will be the universal mount-point for IPython
# The Galaxy instance can copy in data that needs to be present to the
# container
RUN mkdir /import
[..]
30 / 37

(continued)

[..]
# Nginx configuration
COPY ./proxy.conf /proxy.conf
VOLUME ["/import"]
WORKDIR /import/
# EXTREMELY IMPORTANT! You must expose a SINGLE port on your container.
EXPOSE 80
CMD /startup.sh
31 / 37
  • Proxy configuration (nginx)
    • reverse proxy our directory listing process running on port 8000
server {
listen 80;
server_name localhost;
# Note the trailing slash used everywhere!
location PROXY_PREFIX/helloworld/ {
proxy_buffering off;
proxy_pass http://127.0.0.1:8000/;
proxy_redirect http://127.0.0.1:8000/ PROXY_PREFIX/helloworld/;
}
}
32 / 37
  • Create the startup.sh file which starts our directory listing service
#!/bin/bash
# First, replace the PROXY_PREFIX value in /proxy.conf with the value from
# the environment variable.
sed -i "s|PROXY_PREFIX|${PROXY_PREFIX}|" /proxy.conf;
# Then copy into the default location for ubuntu+nginx
cp /proxy.conf /etc/nginx/sites-enabled/default;
# Here you would normally start whatever service you want to start. In our
# example we start a simple directory listing service on port 8000
cd /import/ && python -mSimpleHTTPServer &
# Launch traffic monitor which will automatically kill the container if
# traffic stops
/monitor_traffic.sh &
# And finally launch nginx in foreground mode. This will make debugging
# easier as logs will be available from `docker logs ...`
nginx -g 'daemon off;'
33 / 37
  • Lastly, the script to monitor traffic and shut down if user is no longer connected, monitor_traffic.sh
#!/bin/bash
while true; do
sleep 60
if [ `netstat -t | grep -v CLOSE_WAIT | grep ':80' | wc -l` -lt 3 ]
then
pkill nginx
fi
done
34 / 37

Hello World Example

  • We are now ready to build our container, and try out our new GIE!
  • If the container is hosted on a service like Dockerhub or quay.io, it will be automatically fetched on first run.
$ cd hello-ie
$ docker build -t hello-ie .

A directory listing is shown in the center panel.

Try it yourself: https://github.com/hexylena/hello-world-interactive-environment

35 / 37

keypoints Key points

  • Interactive Environments offer access to third-party applications within Galaxy

  • Interactive Environments run in a docker images for sandboxing and easy dependency management

36 / 37

Thank You!

This material is the result of a collaborative work. Thanks to the Galaxy Training Network and all the contributors!

Galaxy Training Network

Tutorial Content is licensed under Creative Commons Attribution 4.0 International License.

37 / 37

Requirements

Before diving into this slide deck, we recommend you to have a look at:

  • Docker basics
2 / 37
Paused

Help

Keyboard shortcuts

, , Pg Up, k Go to previous slide
, , Pg Dn, Space, j Go to next slide
Home Go to first slide
End Go to last slide
Number + Return Go to specific slide
b / m / f Toggle blackout / mirrored / fullscreen mode
c Clone slideshow
p Toggle presenter mode
t Restart the presentation timer
?, h Toggle this help
Esc Back to slideshow