e6863fd4 | 16-May-2023 |
Xuan Hu <[email protected]> |
dispatch: add vector preg allocation |
b6b11f60 | 22-May-2023 |
Xuan Hu <[email protected]> |
backend: add vector related datapath and configs |
3f6c8c2c | 10-May-2023 |
Xuan Hu <[email protected]> |
Merge branch 'dev-vector' into new-backend |
4255f8a9 | 20-Apr-2023 |
Xuan Hu <[email protected]> |
Merge remote-tracking branch 'upstream/master' into new-backend-merge-master |
d8aa3d57 | 20-Apr-2023 |
bugGenerator <[email protected]> |
perf: add some slot util perf counters of id/rn/dp (#2046) |
67fcf090 | 18-Apr-2023 |
Xuan Hu <[email protected]> |
Merge remote-tracking branch 'upstream/master' into new-backend |
730cfbc0 | 16-Apr-2023 |
Xuan Hu <[email protected]> |
backend: merge v2backend into backend |
124bf66a | 12-Apr-2023 |
Xuan Hu <[email protected]> |
backend,Core: remove dead code and comments |
6ed1154e | 26-Mar-2023 |
Tang Haojin <[email protected]> |
top-down: add rob head type into consideration (#1999)
* top-down: add rob head type into consideration
* top-down: put counters into EnableTopDown scope |
3b739f49 | 06-Mar-2023 |
Xuan Hu <[email protected]> |
v2backend: huge tmp commit |
0f038924 | 16-Jan-2023 |
ZhangZifei <[email protected]> |
backend,vector: fix vector relative bug and first vadd instr success
Modification and Bugs includes: 1. readFpRf/writeFpRf is replaced with readFpVecRf/writeFpVecRf in some places; 2. fpWen is repla
backend,vector: fix vector relative bug and first vadd instr success
Modification and Bugs includes: 1. readFpRf/writeFpRf is replaced with readFpVecRf/writeFpVecRf in some places; 2. fpWen is replaced with fpVecWen in some places; 3. add ADD/SUB decode info 4. dispatch logic modification 5. dataWidth & wakeup logic in rs 6. ExuInput/ExuOutput at many places 7. fuSel inside FUBlock of FMAC 8. FuType encoding 9. many other bugs
show more ...
|
9ab1568e | 15-Nov-2022 |
czw <[email protected]> |
rs: mv rf-read from dispatch2rs to rs-select(asyn read regfile now)
chore(*): Change Sequential Parameter Pass to Parameter Name Parameter Passing
refactor(Regfile): Modify Synchronous Read to Asyn
rs: mv rf-read from dispatch2rs to rs-select(asyn read regfile now)
chore(*): Change Sequential Parameter Pass to Parameter Name Parameter Passing
refactor(Regfile): Modify Synchronous Read to Asynchronous Read
refactor(Scheduler, ReservationStationBase): Connect the asynchronous read port of the register and the reserved station
1. add parameter( numIntRfReadPorts, numFpRfReadPorts, params.exuCfg) 2. fix extractReadRf 3. remove dataArray and add dataArrayWrite, dataArrayMultiWrite, s1_out_addr 4. add immBypassedData2 for bypass and fix DataSelect
refactor(ReservationStationStd): fix connect between s1_deqRfDataSel and readFpRf_asyn(i).data
refactor(ReservationStationJump): add jalrMem and fix immExts connect
show more ...
|
3c02ee8f | 25-Dec-2022 |
wakafa <[email protected]> |
Separate Utility submodule from XiangShan (#1861)
* misc: add utility submodule
* misc: adjust to new utility framework
* bump utility: revert resetgen
* bump huancun |
eb163ef0 | 17-Nov-2022 |
Haojin Tang <[email protected]> |
top-down: introduce top-down counters and scripts (#1803)
* top-down: add initial top-down features
* rob600: enlarge queue/buffer size
* :art: After git pull
* :sparkles: Add BranchResteer
top-down: introduce top-down counters and scripts (#1803)
* top-down: add initial top-down features
* rob600: enlarge queue/buffer size
* :art: After git pull
* :sparkles: Add BranchResteers->CtrlBlock
* :sparkles: Cg BranchResteers after pending
* :sparkles: Add robflush_bubble & ldReplay_bubble
* :ambulance: Fix loadReplay->loadReplay.valid
* :art: Dlt printf
* :sparkles: Add stage2_redirect_cycles->CtrlBlock
* :saprkles: CtrlBlock:Add s2Redirect_when_pending
* :sparkles: ID:Add ifu2id_allNO_cycle
* :sparkles: Add ifu2ibuffer_validCnt
* :sparkles: Add ibuffer_IDWidth_hvButNotFull
* :sparkles: Fix ifu2ibuffer_validCnt
* :ambulance: Fix ibuffer_IDWidth_hvButNotFull
* :sparkles: Fix ifu2ibuffer_validCnt->stop
* feat(buggy): parameterize load/store pipeline, etc.
* fix: use LoadPipelineWidth rather than LoadQueueSize
* fix: parameterize `rdataPtrExtNext`
* fix(SBuffer): fix idx update logic
* fix(Sbuffer): use `&&` to generate flushMask instead of `||`
* fix(atomic): parameterize atomic logic in `MemBlock`
* fix(StoreQueue): update allow enque requirement
* chore: update comments, requirements and assertions
* chore: refactor some Mux to meet original logic
* feat: reduce `LsMaxRsDeq` to 2 and delete it
* feat: support one load/store pipeline
* feat: parameterize `EnsbufferWidth`
* chore: resharp codes for better generated name
* top-down: add initial top-down features
* rob600: enlarge queue/buffer size
* top-down: add l1, l2, l3 and ddr loads bound perf counters
* top-down: dig into l1d loads bound
* top-down: move memory related counters to `Scheduler`
* top-down: add 2 Ldus and 2 Stus
* top-down: v1.0
* huancun: bump HuanCun to a version with top-down
* chore: restore parameters and update `build.sc`
* top-down: use ExcitingUtils instead of BoringUtils
* top-down: add switch of top-down counters
* top-down: add top-down scripts
* difftest: enlarge stuck limit cycles again
Co-authored-by: gaozeyu <[email protected]>
show more ...
|
b56f947e | 18-Jul-2022 |
Yinan Xu <[email protected]> |
ftq,ctrl: add copies for pc and jalr_target data modules (#1661)
* ftq, ctrl: remove pc/target backend read ports, and remove redirectGen in ftq
* ctrl: add data modules for pc and jalr_target
ftq,ctrl: add copies for pc and jalr_target data modules (#1661)
* ftq, ctrl: remove pc/target backend read ports, and remove redirectGen in ftq
* ctrl: add data modules for pc and jalr_target
This commit adds two data modules for pc and jalr_target respectively.
They are the same as data modules in frontend. Should benefit timing.
* jump: reduce pc and jalr_target read latency
* ftq: add predecode redirect update target interface, valid only on ifuRedirect
* ftq, ctrl: add second write port logic of jalrTargetMem, and delay write of pc/target mem for two cycles
Co-authored-by: Lingrui98 <[email protected]>
show more ...
|
fd09b64a | 13-Jul-2022 |
Yinan Xu <[email protected]> |
dispatch2: optimize slow path and enqPtr matching timing (#1650)
* dpq: add slow path for non-critical registers
This commit separates the data module in Dispatch to slow and fast path.
Slow pat
dispatch2: optimize slow path and enqPtr matching timing (#1650)
* dpq: add slow path for non-critical registers
This commit separates the data module in Dispatch to slow and fast path.
Slow path stores the data with a bad timing at Dispatch but a good timing
at Dispatch2. Thus should benefit the timing at Dispatch, such as the LFST.
For now, we merge the slow and fast data module. Chisel DCE does not
eliminate the dead registers. We manully merge the two data modules
for now.
* dpq: optimize timing for enqPtr/deqPtr matching
This commit optimizes the matching timing between enqPtr and deqPtr,
which is used further for bypassing enqData to deqData.
Now enqOffset and deqPtr/enqPtr matching work in parallel.
show more ...
|
1cee9cb8 | 12-Jul-2022 |
Yinan Xu <[email protected]> |
ctrl: optimize the timing of dispatch2 stage (#1632)
* ctrl: copy dispatch2 to avoid cross-module loops
This commit makes copies of dispatch2 in CtrlBlock to avoid long
cross-module timing loop
ctrl: optimize the timing of dispatch2 stage (#1632)
* ctrl: copy dispatch2 to avoid cross-module loops
This commit makes copies of dispatch2 in CtrlBlock to avoid long
cross-module timing loop paths. Should be good for timing.
* dpq: re-write queue read logic
This commit adds a Reg-Vec to store the queue read data. Since
most queues read at most the current numRead and the next numRead
entries, the read timing can be optimized by reading the data one
cycle earlier.
show more ...
|
0dc4893d | 10-Jul-2022 |
Yinan Xu <[email protected]> |
core: optimize redirect timing (#1630)
This commit adds separated redirect registers in ExuBlock and MemBlock.
They have one cycle latency compared to redirect in CtrlBlock. This will
help reduce
core: optimize redirect timing (#1630)
This commit adds separated redirect registers in ExuBlock and MemBlock.
They have one cycle latency compared to redirect in CtrlBlock. This will
help reduce the fanout of redirect registers.
show more ...
|
00210c34 | 06-Jul-2022 |
Yinan Xu <[email protected]> |
dpq: optimize read and write timing of data module (#1610)
This commit changes the data modules in Dispatch Queue. We use one-hot
indices to read and write the data array. |
fa9d712c | 27-Jun-2022 |
Yinan Xu <[email protected]> |
dp2: add a pipeline for load/store (#1597)
* dp2: add a pipeline for load/store
Load/store Dispatch2 has a bad timing because it requires the fuType
to disguish the out ports. This brings timing
dp2: add a pipeline for load/store (#1597)
* dp2: add a pipeline for load/store
Load/store Dispatch2 has a bad timing because it requires the fuType
to disguish the out ports. This brings timing issues because the
instruction has to read busyTable after the port arbitration.
This commit adds a pipeline in dp2Ls, which may cause performance
degradation. Instructions are dispatched according to out, and at
the next cycle it will leave dp2.
* bump difftest trying to fix vcs
show more ...
|
a19215dd | 18-Jun-2022 |
Yinan Xu <[email protected]> |
decode: do not set lsrc of LUI for better timing (#1586)
This commit changes the lsrc/psrc of LUI in dispatch instead of
decode to optimize the timing of lsrc in DecodeStage, which is
critical for
decode: do not set lsrc of LUI for better timing (#1586)
This commit changes the lsrc/psrc of LUI in dispatch instead of
decode to optimize the timing of lsrc in DecodeStage, which is
critical for rename table.
lsrc/ldest should be directly get from instr for the timing. Fused
instructions change lsrc/ldest now, which will be optimized later.
show more ...
|
25ac26c6 | 11-May-2022 |
William Wang <[email protected]> |
Fix vcs simulation support, support manually set ram_size (#1551)
* difftest: disable runahead to make vcs happy
* difftest: bump huancun to make vcs happy
* difftest: bump difftest and ready-
Fix vcs simulation support, support manually set ram_size (#1551)
* difftest: disable runahead to make vcs happy
* difftest: bump huancun to make vcs happy
* difftest: bump difftest and ready-to-run
* difftest support ramsize and paddr base config
* 8GB/16GB nemu so are provided by ready-to-run
* ci: update nightly ci, manually set ram_size
* difftest: bump huancun to make vcs happy
* difftest,nemu: support run-time assign mem size
* ci: polish nightly ci script
show more ...
|
46f74b57 | 06-May-2022 |
Haojin Tang <[email protected]> |
feat: parameterize load store (#1527)
* feat: parameterize load/store pipeline, etc.
* fix: use LoadPipelineWidth rather than LoadQueueSize
* fix: parameterize `rdataPtrExtNext`
* SBuffer:
feat: parameterize load store (#1527)
* feat: parameterize load/store pipeline, etc.
* fix: use LoadPipelineWidth rather than LoadQueueSize
* fix: parameterize `rdataPtrExtNext`
* SBuffer: fix idx update logic
* atomic: parameterize atomic logic in `MemBlock`
* StoreQueue: update allow enque requirement
* feat: support one load/store pipeline
* feat: parameterize `EnsbufferWidth`
* chore: resharp codes for better generated name
show more ...
|
5d669833 | 04-May-2022 |
Yinan Xu <[email protected]> |
csr: check WFI and other illegal instructions |
b6900d94 | 28-Apr-2022 |
Yinan Xu <[email protected]> |
core,rob: support the WFI instruction
The RISC-V WFI instruction is previously decoded as NOP. This commit adds support for the real wait-for-interrupt (WFI).
We add a state_wfi FSM in the ROB. Aft
core,rob: support the WFI instruction
The RISC-V WFI instruction is previously decoded as NOP. This commit adds support for the real wait-for-interrupt (WFI).
We add a state_wfi FSM in the ROB. After WFI leaves the ROB, the next instruction will wait in the ROB until an interrupt.
show more ...
|