evka85 / gem_amc
GEM uTCA AMC common firmware logic as well as board specific implementations for GLIB (Virtex 6) and CTP7 (Virtex 7)
@evka85 it's important to have a way to determine which clock is currently doing the TTC decoding, which will determine whether a firmware reload is necessary.
Is there a register currently giving this information? If not, it should be added.
Right now when updating the address table used on a host machine it does not have the appropriate XInclude lines to catch the OH address table as described:
This requires a user to undergo tedious actions, e.g. insert the lines by hand each time, when updating address tables.
Specifically:
Remove the lines:
<node id=top>
...
</node>
Then change the GEM_AMC node declaration to:
<node id='GEM_AMC' xmlns:xi="http://www.w3.org/2001/XInclude">
Finally, as shown in the xhal pull request above, in the OH${OH_IDX} node, just before the GEB node, add:
<xi:include href="optohybrid_registers.xml"/>
The resulting part of the XML should look like this:
<node id="OH${OH_IDX}" address="0x0"
description="Optohybrid ${OH_IDX}"
generate="true" generate_size="2" generate_address_step="0x00010000" generate_idx_var="OH_IDX">
<!--Insert here the OH FPGA module -->
<xi:include href="optohybrid_registers.xml"/>
<node id="GEB" address="0x100000"
description="VFAT3 registers">
Could the next release of AMC FW include this update to the address table as was requested in the past?
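Until the release ships with these lines, the last step could be scripted. The following is a minimal sketch using only the Python standard library; the function name add_oh_include is hypothetical, and it only covers inserting the xi:include element before each GEB node (removing the top node is not handled, and XML comments are dropped by the default parser).

```python
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"


def add_oh_include(xml_text, href="optohybrid_registers.xml"):
    """Insert <xi:include href=.../> just before every node with id="GEB".

    Hypothetical helper, not part of GEM_AMC or xhal.
    """
    ET.register_namespace("xi", XI_NS)
    include_tag = "{%s}include" % XI_NS
    root = ET.fromstring(xml_text)
    # Snapshot the node list first so the tree is not modified while iterating.
    for parent in list(root.iter("node")):
        children = list(parent)
        if any(child.tag == include_tag for child in children):
            continue  # this parent was already patched
        for index, child in enumerate(children):
            if child.tag == "node" and child.get("id") == "GEB":
                parent.insert(index, ET.Element(include_tag, {"href": href}))
                break
    return ET.tostring(root, encoding="unicode")
```

Running the function twice on the same text leaves a single include per OH node, so it should be safe to call on every address table update.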
Issue exists in: GEM_AMC release v3.4.3.
When programming OHv3b from CTP7 I noticed the following:
eagle60:~/apps/reg_interface$ python sca.py 0x1 program-fpga bit ~/oh_fw/OH-20180223-3.0.10.A.bit
This will program FW of link 0 but wipes FW of link 1.
eagle60:~/apps/reg_interface$ python sca.py 0x2 program-fpga bit ~/oh_fw/OH-20180223-3.0.10.A.bit
Programs FW of link 1 but wipes FW of link 0.
eagle60:~/apps/reg_interface$ python sca.py 0x3 program-fpga bit ~/oh_fw/OH-20180223-3.0.10.A.bit
Will wipe FW and program the FPGAs of both link 0 and link 1.
I would have expected the wiping of the FPGA firmware to follow the OH mask, i.e. if a link is not included in the mask its FW will not be wiped.
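The expected behaviour can be stated as a small sketch: only the links whose bit is set in the OH mask should be touched. The helper name links_in_mask and the default of 12 links are assumptions for illustration, not taken from sca.py.

```python
def links_in_mask(oh_mask, n_links=12):
    """Return the OH link numbers whose bit is set in oh_mask.

    Hypothetical helper; n_links=12 is an assumed upper bound.
    """
    return [link for link in range(n_links) if (oh_mask >> link) & 1]


# Masks used in the three commands above.
for mask in (0x1, 0x2, 0x3):
    print(hex(mask), links_in_mask(mask))
```

Under this reading, mask 0x1 should affect link 0 only, and link 1 should keep its firmware.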
Following the discussion on Slides 2-5, I would like to drop BxmAvO from the GEM Event Data format and any corresponding status bit/flag registers in the FW.
I propose to add BxmVvV to represent a BX mismatch between VFATX and VFATY: for all X != Y, the BX of VFATX is compared against the BX of VFATY, and a flag is raised if they differ.
Also add a FW flag AMC_SEES_VFATX_VFATY_BX_MISMATCH, which would be raised high if two or more VFATs had mismatching BXs. Then in the DAQ.STATUS node I would propose to add a LAST_VFAT_BX_SENT register which, for each L1A, would store the last BX that was sent by the VFAT, so that if AMC_SEES_VFATX_VFATY_BX_MISMATCH was raised high you could check which VFATs were showing the mismatch.
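As a software illustration of the proposed check (the firmware logic itself would of course be VHDL), the pairwise comparison could look like the sketch below. The function name and the dictionary input are assumptions made for this example.

```python
from itertools import combinations


def bx_mismatch_pairs(vfat_bx):
    """Return all (X, Y) pairs with X < Y whose BX values differ.

    vfat_bx maps a VFAT number to the BX it sent for one L1A
    (hypothetical input format).
    """
    return [
        (x, y)
        for x, y in combinations(sorted(vfat_bx), 2)
        if vfat_bx[x] != vfat_bx[y]
    ]


last_bx = {0: 100, 1: 100, 2: 101}
pairs = bx_mismatch_pairs(last_bx)
flag = bool(pairs)  # plays the role of AMC_SEES_VFATX_VFATY_BX_MISMATCH
print(flag, pairs)
```

In this example VFAT2 disagrees with both VFAT0 and VFAT1, which is the kind of information the LAST_VFAT_BX_SENT register is meant to expose.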
Currently the CTP7 FW doesn't allow PROM-less programming of the GE2/1 optohybrid without executing a special SCA command. As a consequence, a special script has to be used and the procedure is slightly overcomplicated.
OH promless programming should be triggered by sending a hard reset from the CTP7 (or TTC stream)
OH programming doesn't happen upon receiving a hard reset.
After upgrading the firmwares to the latest versions (CTP7 - 3.8.3 ; OH - 3.2.3C), the Sbit monitor started to report wrong clusters even if the trigger links are properly initialized. More precisely, the least significant byte is always set to 0xff.
On the other hand, the Sbit monitor implemented within the OH reports the expected cluster (see CLUSTER0):
eagle63 > kw GEM_AMC.TRIGGER.SBIT_MONITOR
0x66000204 rw GEM_AMC.TRIGGER.SBIT_MONITOR.OH_SELECT 0x00000000
0x66000208 r GEM_AMC.TRIGGER.SBIT_MONITOR.CLUSTER0 0x000060ff
0x6600020c r GEM_AMC.TRIGGER.SBIT_MONITOR.CLUSTER1 0x000007ff
0x66000210 r GEM_AMC.TRIGGER.SBIT_MONITOR.CLUSTER2 0x000007ff
0x66000214 r GEM_AMC.TRIGGER.SBIT_MONITOR.CLUSTER3 0x000007ff
0x66000218 r GEM_AMC.TRIGGER.SBIT_MONITOR.CLUSTER4 0x000007ff
0x6600021c r GEM_AMC.TRIGGER.SBIT_MONITOR.CLUSTER5 0x000007ff
0x66000220 r GEM_AMC.TRIGGER.SBIT_MONITOR.CLUSTER6 0x000007ff
0x66000224 r GEM_AMC.TRIGGER.SBIT_MONITOR.CLUSTER7 0x000007ff
0x66000228 r GEM_AMC.TRIGGER.SBIT_MONITOR.L1A_DELAY 0x07b36daa
eagle63 > kw GEM_AMC.OH.OH0.FPGA.TRIG.SBIT_MONITOR
0x65008240 None GEM_AMC.OH.OH0.FPGA.TRIG.SBIT_MONITOR
0x65008240 w GEM_AMC.OH.OH0.FPGA.TRIG.SBIT_MONITOR.RESET
0x65008244 r GEM_AMC.OH.OH0.FPGA.TRIG.SBIT_MONITOR.CLUSTER0 0x00006000
0x65008248 r GEM_AMC.OH.OH0.FPGA.TRIG.SBIT_MONITOR.CLUSTER1 0x000007ff
0x6500824c r GEM_AMC.OH.OH0.FPGA.TRIG.SBIT_MONITOR.CLUSTER2 0x000007ff
0x65008250 r GEM_AMC.OH.OH0.FPGA.TRIG.SBIT_MONITOR.CLUSTER3 0x000007ff
0x65008254 r GEM_AMC.OH.OH0.FPGA.TRIG.SBIT_MONITOR.CLUSTER4 0x000007ff
0x65008258 r GEM_AMC.OH.OH0.FPGA.TRIG.SBIT_MONITOR.CLUSTER5 0x000007ff
0x6500825c r GEM_AMC.OH.OH0.FPGA.TRIG.SBIT_MONITOR.CLUSTER6 0x000007ff
0x65008260 r GEM_AMC.OH.OH0.FPGA.TRIG.SBIT_MONITOR.CLUSTER7 0x000007ff
0x65008280 r GEM_AMC.OH.OH0.FPGA.TRIG.SBIT_MONITOR.L1A_DELAY 0x975f6ddf
I'm not sure if the issue is in the OH or the CTP7. My current workaround uses the OH Sbit monitor instead of the CTP7 one.
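To make the discrepancy explicit, the two dumps above can be compared bit by bit. This is a small sketch using the CLUSTER0 to CLUSTER7 values copied from the output; the helper name differing_bits is made up for this example.

```python
# Values copied from the two register dumps above (CLUSTER0..CLUSTER7).
CTP7_CLUSTERS = [0x000060FF] + [0x000007FF] * 7
OH_CLUSTERS = [0x00006000] + [0x000007FF] * 7


def differing_bits(a, b):
    """XOR the two lists element-wise; a non-zero entry marks differing bits."""
    return [x ^ y for x, y in zip(a, b)]


for index, diff in enumerate(differing_bits(CTP7_CLUSTERS, OH_CLUSTERS)):
    print("CLUSTER%d differs in bits 0x%08x" % (index, diff))
```

Only CLUSTER0 differs, and only in its lowest byte, which matches the description of the least significant byte being stuck at 0xff on the CTP7 side.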
When testing the new CTP7 firmware release (3.7.0), I noticed that the trigger links are always reported in a healthy state even if the fibers are not connected or the OH FPGA is not programmed.
In the following case, none of the OH10 fibers were connected:
eagle63 > write GEM_AMC.TRIGGER.CTRL.CNT_RESET 1
Initial value to write: 1, register GEM_AMC.TRIGGER.CTRL.CNT_RESET
0x00000001(1) written to GEM_AMC.TRIGGER.CTRL.CNT_RESET
eagle63 > kw GEM_AMC.TRIGGER.OH10.LINK
0x66002e80 r GEM_AMC.TRIGGER.OH10.LINK0_SBIT_OVERFLOW_CNT 0x00000000
0x66002e80 r GEM_AMC.TRIGGER.OH10.LINK1_SBIT_OVERFLOW_CNT 0x00000000
0x66002e84 r GEM_AMC.TRIGGER.OH10.LINK0_MISSED_COMMA_CNT 0x00000000
0x66002e84 r GEM_AMC.TRIGGER.OH10.LINK1_MISSED_COMMA_CNT 0x00000000
0x66002e8c r GEM_AMC.TRIGGER.OH10.LINK0_OVERFLOW_CNT 0x00000000
0x66002e8c r GEM_AMC.TRIGGER.OH10.LINK1_OVERFLOW_CNT 0x00000000
0x66002e90 r GEM_AMC.TRIGGER.OH10.LINK0_UNDERFLOW_CNT 0x00000000
0x66002e90 r GEM_AMC.TRIGGER.OH10.LINK1_UNDERFLOW_CNT 0x00000000
0x66002e94 r GEM_AMC.TRIGGER.OH10.LINK0_SYNC_WORD_CNT 0x00000000
0x66002e94 r GEM_AMC.TRIGGER.OH10.LINK1_SYNC_WORD_CNT 0x00000000
I'll test whether the trigger data content is correctly received, but the status seems wrong: I would have expected some of the error counters to increase when the trigger links are not connected or the OH FPGA is not programmed.
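A check like this could be automated by parsing the kw output. The sketch below assumes the four-column layout shown above (address, permission, name, value); parse_kw and all_zero are hypothetical helpers, not part of reg_interface.

```python
def parse_kw(text):
    """Parse 'address permission name value' lines into {name: value}."""
    regs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0].startswith("0x") and parts[3].startswith("0x"):
            regs[parts[2]] = int(parts[3], 16)
    return regs


def all_zero(regs):
    """True if at least one counter was parsed and every counter reads 0."""
    return bool(regs) and all(value == 0 for value in regs.values())


sample = """\
0x66002e84 r GEM_AMC.TRIGGER.OH10.LINK0_MISSED_COMMA_CNT 0x00000000
0x66002e84 r GEM_AMC.TRIGGER.OH10.LINK1_MISSED_COMMA_CNT 0x00000000
"""
print(all_zero(parse_kw(sample)))
```

An all-zero result on a link that is known to be disconnected would then be reported as suspicious rather than healthy.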
As discussed with @evka85, and in cms-gem-daq-project/gemctp7user#41, creating an RPM package for the scripts in this repository that are necessary to install on the CTP7 is the best way forward.
In addition, the bitfile and address table can be packaged in this way, and versioned dependencies with, e.g., ctp7_modules can be enforced.
Makefile with a make rpm target, and a simple spec template.
In the v3 GEM Event Format it shows that in the Chamber Payload the BxmAvV and OOScAvV flags are always set to 0 in the current version of the FW.
I was hoping that these could no longer be always set to 0 but instead be actually counted. This would be really useful in DQM and trying to understand the data coming from QC8.
Seems like there are still a few bugs in the uHAL address table that were not mentioned in issue #37.
$ vfat_info_uhal.py --shelf=1 -s2 -g 11
01 Jul 2019 11:59:10.193 [7f163d870740] INFO - optohybrid_user_functions_uhal::getOHObject <> - gem.shelf01.amc02.optohybrid11: Success!
01 Jul 2019 11:59:10.193 [7f163d870740] INFO - optohybrid_user_functions_uhal::getOHObject <> - gem.shelf01.amc02.optohybrid11: Success!
--=======================================--
Firmware: Version Date
AMC : 3.7.2 23/5/2017
Traceback (most recent call last):
File "/opt/cmsgemos/bin/vfat_info_uhal.py", line 84, in <module>
print "OH : %10s %10s"%(getFirmwareVersion(ohboard,options.gtx),
File "/usr/lib/python2.7/site-packages/gempython/tools/optohybrid_user_functions_uhal.py", line 86, in getFirmwareVersion
fwver = readRegister(device,"%s.RELEASE.VERSION"%(baseNode),debug)
File "/usr/lib/python2.7/site-packages/gempython/utils/registers_uhal.py", line 35, in readRegister
device.getNode(register).getPath(),
uhal._core.exception: No branch found with ID-path "GEM_AMC.OH.OH11.FPGA.CONTROL.RELEASE.VERSION" from node "top"
Partial match "GEM_AMC.OH.OH11" found for ID-path "GEM_AMC.OH.OH11.FPGA.CONTROL.RELEASE.VERSION" from node "top"
@jsturdy points out that:
FPGA node is missing from uhal address tables
This was observed in FW release 3.8.3.
Not sure if this was resolved in 3.8.4 as well.
@evka85 ?
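A quick way to see where a register path stops resolving, without going through uhal, is to walk the address table XML directly. The sketch below uses only the standard library and assumes a fully expanded table (nodes with generate="true" are not expanded); the function name longest_match is hypothetical.

```python
import xml.etree.ElementTree as ET


def longest_match(top, path):
    """Return the longest dotted prefix of path that exists below the top node."""
    node = top
    matched = []
    for part in path.split("."):
        child = next(
            (c for c in node.findall("node") if c.get("id") == part), None
        )
        if child is None:
            break
        node = child
        matched.append(part)
    return ".".join(matched)


table = ET.fromstring(
    '<node id="top"><node id="GEM_AMC"><node id="OH">'
    '<node id="OH11"/></node></node></node>'
)
print(longest_match(table, "GEM_AMC.OH.OH11.FPGA.CONTROL.RELEASE.VERSION"))
```

With this toy table the result stops at GEM_AMC.OH.OH11, mirroring the "Partial match" reported in the traceback when the FPGA node is absent.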
When testing the new CTP7 firmware release (3.7.0), I noticed that the address tables shipped with the bitfile release contain only 4 OHs instead of 12.
This can also be seen in the repository code :
GEM_AMC/scripts/address_table/gem_amc_top.xml
Lines 1283 to 1285 in 2c7c6b8
After correcting the "new address table", as it is named in the archive, I was able to communicate with the 12 OHs with the gem_reg.py tool.
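The number of OH blocks comes from the generate attributes of the OH node (generate_size, generate_address_step, generate_idx_var), as in the OH${OH_IDX} snippet quoted earlier on this page. The sketch below shows how such a template expands; the base address and step used in the example are taken from that snippet and may not match the 3.7.0 table, and expand_generate is a hypothetical helper.

```python
from string import Template


def expand_generate(node_id, base, size, step, idx_var):
    """Expand a generate-style node template into (id, address) pairs."""
    return [
        (Template(node_id).substitute({idx_var: index}), base + index * step)
        for index in range(size)
    ]


# 12 OHs with the step from the quoted snippet (an assumption for 3.7.0).
for name, address in expand_generate("OH${OH_IDX}", 0x0, 12, 0x00010000, "OH_IDX"):
    print(name, hex(address))
```

With generate_size set to 4, the same call would stop at OH3, which is consistent with the symptom described above.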
Right now, for example on eagle60, we have divergent behaviour of apps/reg_interface; see for example:
drwxr-sr-x 5 texas texas 4096 Feb 22 00:00 apps
drwxr-sr-x 5 texas texas 4096 Feb 16 2017 apps_NOT_WORKING_MISHAS
drwxr-sr-x 7 texas texas 4096 Feb 16 2017 apps_bkp
Right now it's unclear which set of scripts should be used or supported, and how a developer (but non-expert) can use a CTP7 without intervention by @mexanick or @evka85 for tasks like updating the lmdb.
Additionally, the expected format of the address tables has diverged; this is also related to #20.
To standardize things and continue central support @evka85 could you:
xhal repository,
xhal repository via this template, and
cms-gem-daq-project umbrella project so that other developers can take advantage of the infrastructure setup there.
The uhal generated address table has a typo:
<node id="GBT" address="0x0" permission="rw" mode="block"size="3312"/>
which should be (space between the " closing the mode and the next key size):
<node id="GBT" address="0x0" permission="rw" mode="block" size="3312"/>
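Until the generator is fixed, the missing space could be patched on the host side. The following is a heuristic sketch, not a general XML repair tool: it inserts a space after any double quote that is immediately followed by something that looks like an attribute name and an equals sign. The function name is made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# A double quote directly followed by name=" is treated as a closing quote
# with a missing separator (heuristic, assumes double-quoted attributes).
_MISSING_SPACE = re.compile(r'"(?=[A-Za-z_][\w.:-]*=")')


def fix_missing_attr_space(line):
    """Insert the missing space between two adjacent XML attributes."""
    return _MISSING_SPACE.sub('" ', line)


broken = '<node id="GBT" address="0x0" permission="rw" mode="block"size="3312"/>'
fixed = fix_missing_attr_space(broken)
print(fixed)
print(ET.fromstring(fixed).get("size"))
```

Lines that are already well formed are returned unchanged, so the function can be applied to the whole file line by line.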
General Description of Task (as I have understood):
The GBTx can have its firmware rewritten in a variety of ways, but the method we are concerned with is Internal Command (IC). The firmware required to communicate with, and write firmware to, a particular GBT is already implemented in common/hdl/slow_control/gbtx_ic_controller.vhd. Furthermore, a means of reading information from the SCA (which could be structurally similar to reading from the GBTx with the HDLC protocol) has also been written in slow_control.
Resources:
GBT manual: https://espace.cern.ch/GBT-Project/GBTX/Manuals/gbtxManual.pdf
Any manual in the same realm of discussion (including SCA stuff): https://espace.cern.ch/GBT-Project/GBTX/Manuals/Forms/AllItems.aspx
The current steps taken by our group are purely introductory: understanding how frames work, downloading a firmware emulator, and designing the relevant test bench. For this purpose it would be useful to obtain some real-life data, as discussed yesterday.
I expect that, in pursuit of this goal, we will not create a new vhd file, but will instead add the read capability to gbt_ic_controller.vhd.
Right now the GEM_AMC.SLOW_CONTROL.SCA.CTRL.TTC_HARD_RESET_EN register defaults to 0x1 for FW versions 3.X.Y.
Whenever a TTC hard reset is sent by CMS in P5, this causes an SCA reset, which results in lost firmware on the OH. Can this register default to 0x0 instead?
In the gemctp7user tools repo we rely on being able to specify a release version and getting the corresponding files.
Currently, this is muddled by mixing different era files within a given "release".
While understandable from a pre-release perspective, it would be best if when a release becomes "final" it always has the same convention.
The expectation (based on previous releases) is that this will always work:
then
echo "CTP7 firmware fw/gem_ctp7_v${ctp7fw//./_}_GBT.bit missing, downloading"
echo "wget https://github.com/evka85/GEM_AMC/releases/download/v${ctp7fw}/gem_ctp7_v${ctp7fw//./_}_GBT.bit -O fw/gem_ctp7_v${ctp7fw//./_}_GBT.bit"
wget https://github.com/evka85/GEM_AMC/releases/download/v${ctp7fw}/gem_ctp7_v${ctp7fw//./_}_GBT.bit -O fw/gem_ctp7_v${ctp7fw//./_}_GBT.bit
fi
echo "ln -sf gem_ctp7_v${ctp7fw//./_}_GBT.bit fw/gem_ctp7.bit"
ln -sf gem_ctp7_v${ctp7fw//./_}_GBT.bit fw/gem_ctp7.bit
if [ ! -f "xml/gem_amc_top_${ctp7fw//./_}.xml" ]
then
echo "CTP7 firmware xml/gem_amc_top_${ctp7fw//./_}.xml missing, downloading"
echo "wget https://github.com/evka85/GEM_AMC/releases/download/v${ctp7fw}/address_table_v${ctp7fw//./_}_GBT.zip"
wget https://github.com/evka85/GEM_AMC/releases/download/v${ctp7fw}/address_table_v${ctp7fw//./_}_GBT.zip
echo "unzip address_table_v${ctp7fw//./_}_GBT.zip"
unzip address_table_v${ctp7fw//./_}_GBT.zip
echo "cp address_table_v${ctp7fw//./_}_GBT/gem_amc_top.xml xml/gem_amc_v${ctp7fw//./_}.xml"
cp address_table_v${ctp7fw//./_}_GBT/gem_amc_top.xml xml/gem_amc_v${ctp7fw//./_}.xml
echo "rm -rf address_table_v${ctp7fw//./_}_GBT"
rm -rf address_table_v${ctp7fw//./_}_GBT
fi
If some of this is obsolete (e.g., using the _GBT suffix), that's fine, but the general structure needs to be consistent (see also cms-gem-daq-project/xhal#27).
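The naming convention the script relies on can be summarised as a function of the release version. This sketch only restates the URL patterns from the shell snippet above (dots in the version replaced by underscores, _GBT suffix); release_assets is a hypothetical name and no download is performed.

```python
BASE_URL = "https://github.com/evka85/GEM_AMC/releases/download"


def release_assets(version):
    """Build the asset URLs the shell snippet above expects for a release."""
    tag = version.replace(".", "_")  # same as ${ctp7fw//./_} in the script
    return {
        "bitfile": "%s/v%s/gem_ctp7_v%s_GBT.bit" % (BASE_URL, version, tag),
        "address_table": "%s/v%s/address_table_v%s_GBT.zip" % (BASE_URL, version, tag),
    }


for kind, url in release_assets("3.8.4").items():
    print(kind, url)
```

If a final release publishes files under any other pattern, both URLs above fail to resolve, which is the inconsistency this issue asks to avoid.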
Looking at the release notes for 3.8.4 I see that:
SCA TTC_HARD_RESET_EN is now an OH mask instead of just one bit
If I call recover.sh on the card I get the following error:
Set ignore TTC hard resets
reg_interface.py -e write GEM_AMC.SLOW_CONTROL.SCA.CTRL.TTC_HARD_RESET_EN 0
Extended prompt module found
Open pickled address table if available /mnt/persistent/gemdaq/xml/gem_amc_top.pickle...
Initial value to write: 0, register GEM_AMC.SLOW_CONTROL.SCA.CTRL.TTC_HARD_RESET_EN
Writing masked reg GEM_AMC.SLOW_CONTROL.SCA.CTRL.TTC_HARD_RESET_EN failed. Exiting...
wReg output -2
0xfffffffe(0) written to GEM_AMC.SLOW_CONTROL.SCA.CTRL.TTC_HARD_RESET_EN
Then trying to write it manually from reg_interface.py on the card unsurprisingly fails:
eagle26:/mnt/persistent/gemdaq$ reg_interface.py
Extended prompt module found
Open pickled address table if available /mnt/persistent/gemdaq/xml/gem_amc_top.pickle...
Starting CTP7 Register Command Line Interface. Please connect to CTP7 using connect <hostname> command unless you use it directly at the CTP7
CTP7 > doc GEM_AMC.SLOW_CONTROL.SCA.CTRL.TTC_HARD_RESET_EN
Name: GEM_AMC.SLOW_CONTROL.SCA.CTRL.TTC_HARD_RESET_EN
Description: When this is set to 1 (default), TTC hard reset commands are forwarded to the SCA to reset the OH FPGA
Address: 0x00b00002
Permission: rw
Mask: 0x00000001
Module: False
Parent: GEM_AMC.SLOW_CONTROL.SCA.CTRL
None
CTP7 > write GEM_AMC.SLOW_CONTROL.SCA.CTRL.TTC_HARD_RESET_EN 0
Initial value to write: 0, register GEM_AMC.SLOW_CONTROL.SCA.CTRL.TTC_HARD_RESET_EN
Writing masked reg GEM_AMC.SLOW_CONTROL.SCA.CTRL.TTC_HARD_RESET_EN failed. Exiting...
wReg output -2
0xfffffffe(0) written to GEM_AMC.SLOW_CONTROL.SCA.CTRL.TTC_HARD_RESET_EN
Moreover the register documentation does not match the fact that this is now a 12-bit register.
Perhaps this is due to an outdated pickle file? I will try this from the DAQ machine once I have finished my FW update.
For the record I'm using GEM_AMC 3.8.6
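Two details of the log can be illustrated with plain arithmetic. The sketch below shows a generic read-modify-write for a masked register, and what the printed 0xfffffffe most likely is: the return code -2 shown as an unsigned 32-bit number. This is an assumption about how a masked write typically works, not the actual reg_interface or wReg implementation, and the helper names are hypothetical.

```python
def apply_masked_write(current, mask, value):
    """Generic read-modify-write: place value into the bits selected by mask."""
    if mask <= 0:
        raise ValueError("mask must be a positive integer")
    shift = (mask & -mask).bit_length() - 1  # position of the lowest set bit
    shifted = value << shift
    if shifted & ~mask:
        raise ValueError("value does not fit in mask")
    return (current & ~mask) | shifted


def to_u32(value):
    """Show a signed return code as an unsigned 32-bit number."""
    return value & 0xFFFFFFFF


# With a stale 1-bit mask only bit 0 (OH0) would be cleared...
print(hex(apply_masked_write(0xFFF, 0x001, 0)))
# ...whereas a 12-bit mask clears the enable bit of every OH.
print(hex(apply_masked_write(0xFFF, 0xFFF, 0)))
print(hex(to_u32(-2)))
```

So even if the write had succeeded, an address table that still declares Mask: 0x00000001 would presumably leave the hard-reset forwarding enabled for OH1 to OH11.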
The current (3.8.3) address tables for uhal introduce syntax errors (in addition to those introduced in #24):
uhal._core.exception: Bit-masked nodes are not allowed to have child nodes
In the following, MGT_CHANNEL_63.RESET cannot have a mask because it has child nodes (the same applies in many other places in this new OPTICAL_LINKS block):
<node id="MGT_CHANNEL_63" address="0xfc0" >
<node id="RESET" address="0x1" permission="rw" mask="0x3">
<node id="TX_RESET" address="0x0" permission="rw" mask="0x1"/>
<node id="RX_RESET" address="0x0" permission="rw" mask="0x2"/>
</node>
<node id="CTRL" address="0x2" permission="rw" mask="0x7f">
<node id="TX_POWERDOWN" address="0x0" permission="rw" mask="0x1"/>
<node id="RX_POWERDOWN" address="0x0" permission="rw" mask="0x2"/>
<node id="TX_POLARITY" address="0x0" permission="rw" mask="0x4"/>
<node id="RX_POLARITY" address="0x0" permission="rw" mask="0x8"/>
<node id="LOOPBACK" address="0x0" permission="rw" mask="0x10"/>
<node id="TX_INHIBIT" address="0x0" permission="rw" mask="0x20"/>
<node id="RX_LOW_POWER_MODE" address="0x0" permission="rw" mask="0x40"/>
<node id="RX_PRBS_SEL" address="0x1" permission="rw" mask="0x7"/>
<node id="TX_PRBS_SEL" address="0x1" permission="rw" mask="0x70"/>
<node id="PRBS_CNT_RESET" address="0x2" permission="w" mask="0x1"/>
<node id="RX_ERROR_CNT_RESET" address="0x3" permission="w" mask="0x1"/>
</node>
<node id="STATUS" address="0x0" permission="r" mask="0xf">
<node id="TX_RESET_DONE" address="0x0" permission="r" mask="0x1"/>
<node id="RX_RESET_DONE" address="0x0" permission="r" mask="0x2"/>
<node id="CPLL_LOCKED" address="0x0" permission="r" mask="0x4"/>
<node id="CPLL_REF_CLK_LOST" address="0x0" permission="r" mask="0x8"/>
<node id="PRBS_ERROR_CNT" address="0x4" permission="r" />
<node id="PRBS_ERROR_CNT" address="0x4" permission="r" />
<node id="RX_NOT_IN_TABLE_CNT" address="0x5" permission="r" />
<node id="RX_DISPERR_CNT" address="0x6" permission="r" />
</node>
</node>
</node>
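Such nodes can be located before handing the table to uhal. The sketch below uses only the standard library to list every node that carries a mask attribute and also has child nodes; the function name is hypothetical and the embedded XML is a shortened copy of the block above.

```python
import xml.etree.ElementTree as ET


def masked_nodes_with_children(xml_text):
    """Return the ids of nodes that have both a mask and child nodes."""
    root = ET.fromstring(xml_text)
    return [
        node.get("id")
        for node in root.iter("node")
        if node.get("mask") is not None and node.find("node") is not None
    ]


snippet = """
<node id="MGT_CHANNEL_63" address="0xfc0">
  <node id="RESET" address="0x1" permission="rw" mask="0x3">
    <node id="TX_RESET" address="0x0" permission="rw" mask="0x1"/>
    <node id="RX_RESET" address="0x0" permission="rw" mask="0x2"/>
  </node>
</node>
"""
print(masked_nodes_with_children(snippet))
```

Each id reported this way is a node uhal would reject with the exception quoted above.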
The ctp7/apps/* can be removed. All the code is stored elsewhere now. The only remaining question is where we keep rw_reg.py.
For the v2b electronics the address table goes as:
"GEM_AMC.OH.OH%d.GEB.VFATS.VFAT%d"%(gtx,chip)
For the v3 electronics the address table goes as:
"GEM_AMC.OH.OH%d.GEB.VFAT%d"%(gtx,chip)
In both cases gtx = OH link number and chip = VFAT number.
Request to have the v3 electronics address table format conform to the v2b address table format. This will simplify the task of writing SW that is compatible with both v2b and v3 HW.
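Until the two formats converge, software has to branch on the hardware generation. A minimal sketch of such a wrapper, using the two format strings quoted above, is shown below; the function name vfat_node_path and the "v2b"/"v3" keys are assumptions for illustration.

```python
VFAT_NODE_FORMATS = {
    "v2b": "GEM_AMC.OH.OH%d.GEB.VFATS.VFAT%d",
    "v3": "GEM_AMC.OH.OH%d.GEB.VFAT%d",
}


def vfat_node_path(gtx, chip, hw_version):
    """Build the VFAT node path for the given OH link and VFAT number."""
    try:
        fmt = VFAT_NODE_FORMATS[hw_version]
    except KeyError:
        raise ValueError("unknown hardware version: %r" % (hw_version,))
    return fmt % (gtx, chip)


print(vfat_node_path(0, 5, "v2b"))
print(vfat_node_path(0, 5, "v3"))
```

If the v3 table adopted the v2b layout, the dictionary would collapse to a single entry and the hw_version argument could be dropped.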