💾 Archived View for thebird.nl › gn-gemtext-threads › topics › data-uploads › editing-data.gmi captured on 2023-04-26 at 13:18:30. Gemini links have been rewritten to link to archived content
⬅️ Previous capture (2022-04-28)
-=-=-=-=-=-=-
At the moment, you can edit metadata related to a published phenotype and a probeset. When an edit is done, the diff data is stored in a table, `metadata_audit` in json format that looks something like:
{"Probeset": {"id_": {"old": "108845", "new": "108845"}, "description": {"old": "EST BG068973 (olfactory receptor related sequence).", "new": "EST BG068973 (olfactory receptor related sequence). Test"}}, "probeset_name": "1446112_at", "author": "Bonface Munyoki K.", "timestamp": "2021-07-10 08:31:18"} {"Probeset": {"id_": {"old": "108845", "new": "108845"}, "description": {"old": "EST BG068973 (olfactory receptor related sequence). Test", "new": "EST BG068973 (olfactory receptor related sequence). Testing"}}, "probeset_name": "1446112_at", "author": "Bonface Munyoki K.", "timestamp": "2021-07-10 08:31:36"} {"Probeset": {"id_": {"old": "108845", "new": "108845"}, "description": {"old": "EST BG068973 (olfactory receptor related sequence). Testing", "new": "EST BG068973 (olfactory receptor related sequence)."}}, "probeset_name": "1446112_at", "author": "Bonface Munyoki K.", "timestamp": "2021-07-10 10:15:37"} {"Phenotype": {"pre_pub_description": {"old": "Testing the description", "new": "Testing the description"}, "post_pub_description": {"old": "Central nervous system, morphology: Internal granule layer (IGL) of the cerebellum volume, adjusted for sex, age, body and brain weight [mm3]", "new": "Central nervous system, morphology: Internal granule layer (IGL) of the cerebellum volume, adjusted for sex, age, body and brain weight [mm3]"}, "original_description": {"old": "Original post publication description: Central nervous system, morphology: Internal granule layer (IGL) of the cerebellum volume, adjusted for sex, age, body and brain weight [mm3]", "new": "Original post publication description: Central nervous system, morphology: Internal granule layer (IGL) of the cerebellum volume, adjusted for sex, age, body and brain weight [mm3]"}, "units": {"old": "mm3", "new": "mm3"}, "pre_pub_abbreviation": {"old": "", "new": ""}, "post_pub_abbreviation": {"old": "AdjIGLVol", "new": "AdjIGLVol"}, "lab_code": {"old": "", "new": ""}, "submitter": {"old": "robwilliams", "new": "robwilliams"}, "owner": {"old": "", "new": ""}, "authorized_users": {"old": "robwilliams", "new": "robwilliams"}}, "dataset_id": "10007", "author": "Bonface Munyoki K.", "timestamp": "2021-07-16 16:35:56"} {"Phenotype": {"pre_pub_description": {"old": "Testing the description", "new": "Testing the descriptiony(test)"}}, "dataset_id": "10007", "author": "Bonface Munyoki K.", "timestamp": "2021-07-16 18:25:15"}
Note how we use one table to store different diffs from different data types. This is important, since if the need ever arises to use a different schema or db altogether, we only use one column.
First, run this sql script to add the "metadata<sub>audit</sub>" table:
Example
mysql -u webqtlout db_webqtl < metadata_audit.sql
And check this works
select * FROM information_schema.COLUMNS WHERE table_schema=DATABASE() AND TABLE_NAME='metadata_audit';
For everything to run, use the latest commit on the main branch from gn3. One way to make sure that you are running the latest changes, make sure that you have genenetwork3 cloned somewhere in your path; and keep it updated. Thereafter set, a PATH in "bin/genenetwork2" which should look something similar to `export PYTHONPATH=/home/bonface/projects/genenetwork3:$PYTHON_GN_PATH:$GN2_BASE_DIR/wqflask:$PYTHONPATH`.
Alternatively, install gn3 in it's own profile. This can be sometimes rinconvenient since the latest changes from gn3 can be ahead of the the package definition in our channel.
These are the steps you can do to edit a phenotype/ probeset:
- Log in to a trait page or a probeset (Make sure that your are logged in). From the "Trait Data and Analysis page" for the probeset/ trait, there's an "Edit" button under "Details and Links". Use this to edit the metadata.
- An example of editing a published phenotype is: "*trait/10007/edit/inbredset-id/10007*"
After editing traits, you could see changes in a diff format at the very top of the update page.
- Atm, creds are hard-coded. See:
https://github.com/genenetwork/genenetwork2/blob/testing/wqflask/wqflask/decorators.py
That needs to be improved.
- For Probeset data, atm some tables aren't yet updated(unfortunately those tables aren't used anywhere). I need to figure that out.
- Fixed excel issue described here:
Consider the case where a user deletes, updates, or modifies data from
Genenetwork using the csv download they got from the data-upload page.
In the event they do not approve the data, and they upload that csv
file again, we have a case in gn2 where data needs to be deleted,
inserted, or modified twice. In the case of deletions, or insertions,
an error will be generated because you cannot remove or insert that
same data twice. Also, the count that's displayed as a flash message
will not be correct (visually) since-- at the time of this writing--
we count the number of edits from a CSV file. A working patch will be
sent later today(or tomorrow).
Also, the GN2 build is not working. With that in mind below I outline
how I've plugged in pudb, inspired by jgarte's demo:
https://git.genenetwork.org/jgart/qc_crlf_demo
- First, in *wqflask/runserver*, I removed Flask DebugToolbar(this
somehow made Flask crash, and I haven't investigated it). Here is
how the diff looks:
modified wqflask/runserver.py @@ -23,9 +23,7 @@ app_config() werkzeug_logger = logging.getLogger('werkzeug') if WEBSERVER_MODE == 'DEBUG': - from flask_debugtoolbar import DebugToolbarExtension app.debug = True - toolbar = DebugToolbarExtension(app) app.run(host='0.0.0.0', port=SERVER_PORT, debug=True,
- As a work-around for the failing gn2 build, install pudb in it's own profile
guix install pudb -p ~/opt/pudb
- Add pudb to your python path in *bin/genenetwork2* as show in this
diff:
-# We may change this one: -export PYTHONPATH=$PYTHON_GN_PATH:$GN2_BASE_DIR/wqflask:$GN3_PYTHONPATH:$PYTHONPATH +# We may change this one: +export PYTHONPATH=$GN2_BASE_DIR/wqflask:/home/bonface/opt/pudb/lib/python3.9/site-packages/:/home/bonface/projects/genenetwork3:$PYTHONPATH
- To use pudb, just stick `import pudb; pu.db` wherever you fancy.
Happy debugging!
Fixed precision point errors. During file upload, excel appended
extra values; so instead of `BXD1,18.8,x,x`, you'd have
`BXD1,18.8000001,20,x`. There are 2 parts to fixing this:
- During generating the json file(when creating the diff), ignore
modifications where |ε| > 0.0001.
- When listing diffs, ignore modification entries where |ε| > 0.0001.
For this fix, the second bit was ignored, because it'll lead to
unnecessarily complicated code, given that only one person has really
been testing the system.