Citizen Science Entry 2
Here's my entry for @lemouth's citizen science program, episode 2. Glad to be back :).
Before starting MadGraph, I picked apart the acronyms in the name of the package: 'MadGraph5_aMC@NLO'. Specifically, I wanted to know what the 'aMC' and 'NLO' terms meant. After some searching, here's what I concluded:
- aMC: adjoint Monte Carlo.
- NLO: Next-to-Leading-Order.
I was most interested in the 'aMC' acronym since 'aMC' appears prominently in the MadGraph5 prompt. So it's gotta be important!
Now for the meat and potatoes: simulating 10,000 proton-proton collisions, where each individual collision yields one top quark and one top antiquark. Following @lemouth's lead in episode 2, here are the commands I entered.
$ ./bin/mg5_aMC
MG5_aMC>generate p p > t t~ # Task 1. Defines the collider process of interest.
MG5_aMC>display diagrams # Task 1. Writes diagrams to /tmp/ in my container.
MG5_aMC>output pp_tt # Task 2. Set output directory and build Fortran code.
MG5_aMC>launch pp_tt # Task 3. Run simulation and compute top anti-top production rates.
Note the '#' at the end of each MG5 command above. Through trial and error, I found that '#' introduces a comment in MadGraph's REPL. For example, generate p p > t t~ # Collider process is a valid command.
An aside on filesystems
Initially I ran the simulation on a shared filesystem (/host/c). That was slow. So I tried both a container-local filesystem and a filesystem mounted in RAM (ramfs). For comparison, I measured the elapsed time of the 'launch' command (minus menu interactions). Here are the results:
- Shared filesystem (/host/c/temp/pp_tt): 14m29s
- Container filesystem (./pp_tt): 6m31s
- RAM filesystem (~/ramfs/pp_tt): 5m47s
The lesson: don't use a shared filesystem like I did :(. It was more than twice as slow. The ramfs helps, but not significantly; that minor effect makes sense, since the simulation is likely compute bound rather than I/O bound.
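For anyone who wants to reproduce the timing comparison, here's a rough sketch (the script name and output path are my own examples, not from my actual run): MadGraph5 accepts a command script as an argument, so a full run can be wrapped in the shell's time builtin.

```shell
# Sketch: write the MG5 commands to a script, then time a full run.
# Point the output path at whichever filesystem you want to test.
cat > pp_tt.mg5 <<'EOF'
generate p p > t t~
output /tmp/pp_tt
launch
EOF
time ./bin/mg5_aMC pp_tt.mg5
```

This times the entire run rather than just the 'launch' step, but it makes the comparison scriptable and repeatable across filesystems.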
Since I'm running MadGraph in a container, there is no interactive display session; it's (nearly) headless. So I have to copy graphical results (like images) to the host machine before viewing them, which incurs some overhead.
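For completeness, this is the kind of copy step involved — a sketch assuming a container named mg5 and a hypothetical diagram filename (both are my own examples):

```shell
# Copy a diagram image out of the container so it can be viewed on the host.
docker cp mg5:/tmp/diagrams_0.ps .
```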
Production Rates and Verification
A screenshot of my final output:
My cross-section is 505.8 ± 0.8 pb, which yields the same production rate that @lemouth determined (assuming the 140/fb luminosity carries two significant figures).
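As a quick sanity check of the expected event count (my own arithmetic, not MadGraph output): with σ = 505.8 pb and an integrated luminosity of 140/fb = 140000/pb, the number of top anti-top events is N = σ × L.

```shell
# N = sigma * L = 505.8 pb * 140000 pb^-1, i.e. roughly 7.1e7 top pairs
awk 'BEGIN { printf "%.0f\n", 505.8 * 140000 }'   # prints 70812000
```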
I verified parton showering, hadronisation, and decay.
That's about it. Looking forward to the next episode and seeing you folks around!
Posted with STEMGeeks
Thanks a lot for this second report, and congratulations on your hard work on this exercise. How long did it take you? I assume you spent a significant amount of time on the Docker environment, didn't you?
This time I have plenty of things to comment on!
The name comes from the merging of the code MadGraph5 ('Mad' refers to Madison in the US, and 'Graph' to Feynman diagrams or graphs) and MC@NLO (a Monte Carlo event generator achieving predictions at next-to-leading-order accuracy in the strong coupling; more information on this is provided in the 6th episode and the upcoming 7th episode). The extra 'a' in the name refers to automation (MadGraph5 was an automated package for predictions at leading-order accuracy; MC@NLO was not automated). By automation, I mean that it is sufficient to specify the process of interest and the physics model, and the code does the rest.
That’s right. This follows standard Python conventions: anything to the right of the hash is ignored.
That’s interesting. I don’t use Docker (as I run everything locally), so I am unable to really comment on this. While RAM filesystems seem better, I assume they are limited in space, aren’t they? In that case, this may be a weakness.
By the way, why don't you run everything locally (possibly in a virtual environment)?
So you don’t have access to an interactive terminal? That’s definitely an overhead, as it prevents you from reading error messages and capturing them live (if relevant). So you use it more like a cluster, on which you would submit a job and recover the output files after they are transferred locally, don’t you? This makes life complicated for testing purposes...
Once again, congratulations for having achieved this episode!
Cheers!
Thanks!
Most of the container setup time was spent in Episode 1. Now that I have a Dockerfile that specifies how to build the container, I can spin up new instances of the development environment for MadGraph5 quickly. This makes experimenting and iterating with systems-level changes quicker (like playing with filesystems).
The initial procedure of entering commands and checking the results took me a couple hours. Writing the post took me about half a day.
Thanks for the clarifications. :)
Exactly, very limited space. Another downside is that you need to preallocate the size of the filesystem, and if the system crashes, you lose all your data. If you have enough memory, RAM filesystems are pretty good for processes that generate a lot of intermediate artifacts (like compiling a large program). But otherwise the downsides outweigh the benefits.
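For reference, the size cap described here is the behaviour of tmpfs, the size-limited cousin of ramfs; a sketch with example paths and sizes (not my actual setup):

```shell
# Mount a 2 GB RAM-backed filesystem; tmpfs enforces the size cap at mount time.
mkdir -p ~/ramfs
sudo mount -t tmpfs -o size=2G tmpfs ~/ramfs
df -h ~/ramfs   # the 2G cap shows up here; contents vanish on unmount or crash
```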
Good question. In this case, I don't have access to a graphical display like X11 or Wayland, so no apps like Firefox or GIMP. But I do have access to an interactive terminal through SSH, and the container runs locally on my computer. So, thankfully, I can see the MadGraph5 process execute in real time and enter commands as you normally would at a Linux terminal.
Thanks for the clarifications and the opportunity to embark on this fun adventure. Looking forward to the next episode!
I can easily imagine that it is also the case for the other participants. Writing the reports always takes more time than the exercises. I however didn't include that when I mentioned that each episode should take a few hours... I actually didn't even think about it. Baah....
This is what I thought for the RAM filesystem. As for the exercises a few episodes from now (which I have not written yet), we will need to simulate collisions and store millions of events (leading to multi-GB intermediate files), and I am not sure this will work, except of course if the machine is powerful enough. I nevertheless do not know whether it is worth testing.
Then it is perfect. You can probably access the HTML output via a text-based browser like links. Cheers, and thanks again for this report!
Good effort! Nice job!
Thanks!