Frequently Asked Questions
Tutorial Questions
Defining a Learning Pathway
Hands-on: Defining a Learning PathwayLearning Pathways are sets of tutorials curated by community experts to form a coherent set of lessons around a topic, building up knowledge step by step.
To define a learning pathway, create a file in the
learning-pathways/
folder. An example file is also given in this folder (pathway-example.md). It should look something like this:---
layout: learning-pathway
title: Title of your pathway
description: |
Description of the pathway. What will be covered, what are the learning objectives, etc?
Make this as thorough as possible, 1-2 paragraphs. This appears on the index page that
lists all the learning paths, and at the top of the pathway page
tags: [some, keywords, here ]
cover-image: path/to/image.png # optional cover image, defaults to GTN logo
cover-image-alt: alt text for this image
pathway:
- section: "Module 1: Title"
description: |
description of the module. What will be covered, what should learners expect, etc.
tutorials:
- name: galaxy-intro-short
topic: introduction
- name: galaxy-intro-101
topic: introduction
- section: "Module 2: Title"
description: |
description of the tutorial
will be shown under the section title
tutorials:
- name: quality-control
topic: sequence-analysis
- name: mapping
topic: sequence-analysis
- name: general-introduction
topic: assembly
- name: chloroplast-assembly
topic: assembly
- name: "My non-GTN session"
external: true
link: "https://example.com"
type: hands_on # or 'slides'
# you can make as many sections as you want, with as many tutorials as you want
---
You can put some extra information here. Markdown syntax can be used. This is shown after the description on the pathway page, but not on the cards on the index page.And that’s it!
We are happy to receive contributions of learning pathways! Did you teach a workshop around a topic using GTN materials? Capture the program as a learning pathways for others to reuse!
Adr
GTN ADR: Image Storage
Context and Problem Statement
Contributors to the GTN have image and occasionally datasets they wish to include in the GTN. These datasets are generally quite small (kilobytes) but, are necessary for the understanding of a tutorial.
Decision Drivers
- We prioritise contributor UX very highly, we cannot ask them to learn multiple systems. Git + Markdown is already enough.
- We wish to be able to sufficiently serve the website offline, with just a clone.
Considered Options
- Storage in git directly
- In another system (e.g. S3)
- Allowing linked images anywhere on the internet.
Decision Outcome
Chosen option: “Storage in git directly”, because it is the simplest solution that meets our requirements, and doesn’t require development we cannot fund, and doesn’t risk dead links over time.
Consequences
- Good, because it is simple and doesn’t require additional development.
- Bad, because it will permanently inflate the size of the repository, and it will never decrease. (We can offset this with
Pros and Cons of the Options
Storage in S3
- Good, because it’s cheap and well known.
- Bad, because we would need to build a way for users to upload images as part of a GTN tutorial development, and then link to them in markdown.
- Bad, because then the website would not be hostable offline.
Hotlinking
- Good, because it’s easy for contributors
- Bad, because unnecessary impact on someone else’s bandwidth
- Bad, because the links will rot over time, images and tutorials will not be able to be followed.
GTN ADR: Why Jekyll and not another Static Site Generator (SSG)
Context and Problem Statement
We needed a static site generator for the GTN, one had to be chosen. We chose Jekyll because of it’s good integration with GitHub and GitHub Pages. Over time our requirements have changed but we still need one SSG.
Decision Drivers
- Must be easy for contributors to setup and use
- Needs to be relatively performant (full rebuilds may not take more than 2 minutes.)
- Must allow us to develop custom plugins
Considered Options
- Jekyll
- Hugo
- A javascript option
- Another SSG.
Decision Outcome
Chosen option: “Jekyll”, because of the amount of time and effort we have sunk into it over the years has made it a good platform for us, despite limitations.
Over time we have invested heavily into Jekyll, any choice to switch must take that into consideration. Consider the following output of
scc _plugins bin/
Language Files Lines Blanks Comments Code Complexity YAML 117 9830 71 33 9726 0 Ruby 90 14471 1795 2617 10059 1163 JSON 48 3075 0 0 3075 0 Python 24 3693 284 272 3137 310 Shell 21 1529 175 262 1092 84 JavaScript 5 299 38 19 242 48 Markdown 4 76 19 0 57 0 Dockerfile 2 60 15 1 44 14 Plain Text 2 18 0 0 18 0 BASH 1 51 8 4 39 1 CSS 1 3 0 0 3 0 Docker ignore 1 1 0 0 1 0 gitignore 1 123 0 0 123 0 Total 317 33229 2405 3208 27616 1620
- Estimated Cost to Develop (organic) $880,671
- Estimated Schedule Effort (organic) 13.11 months
- Estimated People Required (organic) 5.97
- Processed 1081253 bytes, 1.081 megabytes (SI)
This is a lot of code that would need to be rewritten if another language was ever chosen.
The YAML comprises our Kwalify Schemas. There is a good argument for moving to JSON Schema instead. The Ruby however is the bulk of the code that would need to be rewritten. It does a significant number of complex things:
- collecting and collating files off disk / in Jekyll’s Page model into “Learning Materials”, very large objects with hundreds of properties that are used to render each and every template.
- Generating hundreds of pages with a multitude of calculated properties. These would all need to be hand translated.
Additionally any layouts would need to be rewritten from our existing Liquid templates. Note that this is not the full set of templates.
Language Files Lines Blanks Comments Code Complexity HTML 69 5937 830 96 5011 0 Markdown 4 125 1 0 124 0 Total 73 6062 831 96 5135 0
- Estimated Cost to Develop (organic) $150,543
- Estimated Schedule Effort (organic) 6.70 months
- Estimated People Required (organic) 2.00
Consequences
- Good, because it works well for us and has scaled sufficiently to an incredible number of output pages (~7k html/22k files in a full GTN production deployment.) with acceptable build times (<5 minutes in prod, most of the action execution is taken up by contacting other servers, dependencies, and uploading the results.)
- Good, because it has a well supported ecosystem of plugins we can leverage for common tasks
- Good, because we can easily write our own plugins for many tasks.
- Bad, because we it remains difficult to install
- Bad, because people must know Ruby and very few people do (but it isn’t that hard to learn!)
Pros and Cons of the Options
Hugo
- Good, because it would be a single binary, easier to install
- Bad, because plugins do not exist, it does not have a way to hook the internals and work with them which we use extensively.
- Bad, because what plugins do exist, only exist as ‘shortcodes’ that are written in Go templates which are not as powerful as Ruby.
A JavaScript option
- Good, because we could re-use code from other places
- Bad, because the average lifetime of a JavaScript SSG is maybe one year.
- Bad, because they are also quite slow on average (Hub compile times are on the order of 10 minutes.)
GTN Architectural Decision Record Template
This is based on Markdown Architectural Decision Record and lets us record important decisions.
{short title, representative of solved problem and found solution}
Context and Problem Statement
{Describe the context and problem statement, e.g., in free form using two to three sentences or in the form of an illustrative story. You may want to articulate the problem in form of a question and add links to collaboration boards or issue management systems.}
Decision Drivers
- {decision driver 1, e.g., a force, facing concern, …}
- {decision driver 2, e.g., a force, facing concern, …}
- …
Considered Options
- {title of option 1}
- {title of option 2}
- {title of option 3}
- …
Decision Outcome
Chosen option: “{title of option 1}”, because {justification. e.g., only option, which meets k.o. criterion decision driver which resolves force {force} … comes out best (see below)}. Consequences
- Good, because {positive consequence, e.g., improvement of one or more desired qualities, …}
- Bad, because {negative consequence, e.g., compromising one or more desired qualities, …}
- …
Confirmation
{Describe how the implementation of/compliance with the ADR can/will be confirmed. Are the design that was decided for and its implementation in line with the decision made? E.g., a design/code review or a test with a library such as ArchUnit can help validate this. Not that although we classify this element as optional, it is included in many ADRs.}
Pros and Cons of the Options
{title of option 1}
{example | description | pointer to more information | …}
- Good, because {argument a}
- Good, because {argument b}
- Neutral, because {argument c}
- Bad, because {argument d}
- …
{title of other option}
{example description pointer to more information …}
- Good, because {argument a}
- Good, because {argument b}
- Neutral, because {argument c}
- Bad, because {argument d}
- …
More Information
{You might want to provide additional evidence/confidence for the decision outcome here and/or document the team agreement on the decision and/or define when/how this decision the decision should be realized and if/when it should be re-visited. Links to other decisions and resources might appear here as well.}
What is an Architectural Decision Record (ADR)?
ADRs are documents that captures an important architectural decision made along with its context and consequences.
We keep track of some of our important Architecture decisions using a template based on Markdown Architectural Decision Record.
We feel that it is important to document these decisions to help future GTN maintainers understand the context and consequences of the decisions made in the past.
A number of our decisions were made with very explicit intentions, usually to prioritise contributors and ensure they have the best possible experience, maximising this over technical complexity and engineering efforts that are required to support it.
Most of our ADRs follow this pattern: Learners and Contributors come first, developers and deployers will be considered where possible.
Github
Forking the GTN repository
- Go on the GitHub repository: github.com/galaxyproject/training-material
- Click on the Fork button (top-right corner of the page)
Syncing your Fork of the GTN
Whenever you want to contribute something new to the GTN, it is important to start with an up-to-date branch. To do this, you should always update the main branch of your fork, before creating a so-called feature branch, a branch where you make your changes.
- Point your browser to your fork of the GTN repository
- The url will be
https://github.com/<your username>/training_material
(replacing ‘your username’ with your GitHub username)You might see a message like “This branch is 367 commits behind galaxyproject/training-material:main.” as in the screenshot below.
Click the Sync Fork button on your fork to update it to the latest version.
- TIP: never work directly on your main branch, since that will make the sync process more difficult. Always create a new branch before committing your changes.
Updating the default branch from master to main
If you created your fork a long time ago, the default branch on your fork may still be called master instead of main
- Point your browser to your fork of the GTN repository
- The url will be
https://github.com/<your username>/training_material
(replacingwith your GitHub username) Check the default branch that is shown (at top left).
- Does it say
main
?
- Congrats, nothing to do, you can skip the rest of these steps
Does it say
master
? Then you need to update it, following the instructions below- Go to your fork’s settings (Click on the gear icon called “Settings”)
- Find “Branches” on the left
- If it says master you can click on the ⇆ icon to switch branches.
- Select
main
(it may not be present).- If it isn’t present, use the pencil icon to rename
master
tomain
.
Gtn
Annotating Pre-requisites
If you are adding a tutorial, annotating the pre-requisites is an important task! It will help ensure learners know what they need to know before starting the tutorial. They also let instructors plan a schedule optimally.
Internal requirements often include specific features of Galaxy you plan to use in your training material, and let learners know which tutorials to follow first, before starting your tutorial.
requirements:
- type: "internal"
topic_name: galaxy-interface
tutorials:
- collections
- upload-rulesOr you can have external requirements, which link to another site.
requirements:
-
type: "external"
title: "Trackster"
link: "https://wiki.galaxyproject.org/Learn/Visualization"Least commonly needed are software requirements. These are usually used in e.g. Galaxy Admin Training tutorials, but if you have specific software requirements, you can list them here:
requirements:
- type: none
title: "Web browser"
- type: none
title: "A linux-based machine or linux emulator"
- type: none
Input Histories & Answer Keys
Tutorials sometimes require significant amounts of data or data prepared in a very specific manner which often is shown to cause errors for learners that significantly affect downstream results. Input histories are an answer to that:
input_histories:
- label: "UseGalaxy.eu"
history: https://humancellatlas.usegalaxy.eu/u/wendi.bacon.training/h/cs1pre-processing-with-alevin---input-1
date: "2021-09-01"Additionally once the learner has gotten started, tutorials sometimes feature tools which produce stochastic outputs, or have very long-running steps. In these cases, the tutorial authors may provide answer histories to help learners verify that they are on the right track, or to enable them to catch up if they fall behind or something goes wrong.
answer_histories:
- label: "UseGalaxy.eu"
history: https://humancellatlas.usegalaxy.eu/u/j.jakiela/h/generating-a-single-cell-matrix-using-alevin-3
- label: "Older Alevin version"
history: https://humancellatlas.usegalaxy.eu/u/wendi.bacon.training/h/cs1pre-processing-with-alevin---answer-key
date: 2024-01-01Finally, to prevent yourself from accidentally changing those tutorial histories, you can Archive History.
- Select galaxy-history-options History Options which is on the top of the list of datasets in the history panel
- Select galaxy-history-archive Archive History
- Select the Archive history button
Your history is now archived! To find it again, you will need to go to Data → Histories → Archived Histories.
Using the new Contributions Annotation framework
If you are writing a tutorial or slides, there are two ways to annotate contributions:
The old way, which doesn’t accurately track roles
contributors: [hexylena, shiltemann]
And the new way which lets you annotate who has helped build a tutorial in a much richer way:
contributions:
authorship:
- shiltemann
- bebatut
editing:
- hexylena
- bebatut
- natefoo
testing:
- bebatut
infrastructure:
- natefoo
translation:
- shiltemann
funding:
- gallantriesThis is especially important if you want to track funding or infrastructure contributions. The old way doesn’t allow for this, and thus we would strongly recommend you use the new format!
Jekyll
Slow incremental builds
Sometimes cleaning Jekyll’s cache can improve slow (~60s) incremental build times.
jekyll clean
will do this. If you continue to experience--incremental
build (make serve-quick
) time issues, please let us know!
Notebooks
Contributing a Jupyter Notebook to the GTN
Problem: I have a notebook that I’d like to add to the GTN.
Solution: While we do not support directly adding notebooks to the GTN, as all of our notebooks are generated from the tutorial Markdown files, there is an alternative path! Instead you can:
- Install
jupytext
- Use it to convert the ipynb file into a Markdown file (
jupytext notebook.ipynb --to markdown
)- Add this Markdown file to the GTN
- Fix any missing header metadata
Then the GTN’s infrastructure will automatically convert that Markdown file directly to a notebook on deployment. This approach has the advantage that Markdown files are more diff-friendly than ipynb, making it much easier to review updates to a tutorial.