#
873dc383 |
| 20-Jul-2022 |
Lingrui98 <[email protected]> |
ftq, ctrl: fix newest_target logic, pass it to ctrlblock, remove jalrTargetMem and read target from pc_mem
|
#
700e90ab |
| 18-Jul-2022 |
Yinan Xu <[email protected]> |
ftq,ctrl: add copies for pc and jalr_target data modules (#1661)
* ftq, ctrl: remove pc/target backend read ports, and remove redirectGen in ftq
* ctrl: add data modules for pc and jalr_target
This commit adds two data modules for pc and jalr_target respectively. They are the same as data modules in frontend. Should benefit timing.
* jump: reduce pc and jalr_target read latency
* ftq: add predecode redirect update target interface, valid only on ifuRedirect
* ftq, ctrl: add second write port logic of jalrTargetMem, and delay write of pc/target mem for two cycles
Co-authored-by: Lingrui98 <[email protected]>
|
#
ccfddc82 |
| 01-Nov-2022 |
Haojin Tang <[email protected]> |
rename: Re-rename instead of walking back after redirect (#1768)
* freelist & refcounter: implement arch states
* walk: restore and walk again when redirecting
* ROB: optimize invalidation of `valid`
|
#
b56f947e |
| 18-Jul-2022 |
Yinan Xu <[email protected]> |
ftq,ctrl: add copies for pc and jalr_target data modules (#1661)
* ftq, ctrl: remove pc/target backend read ports, and remove redirectGen in ftq
* ctrl: add data modules for pc and jalr_target
This commit adds two data modules for pc and jalr_target respectively.
They are the same as data modules in frontend. Should benefit timing.
* jump: reduce pc and jalr_target read latency
* ftq: add predecode redirect update target interface, valid only on ifuRedirect
* ftq, ctrl: add second write port logic of jalrTargetMem, and delay write of pc/target mem for two cycles
Co-authored-by: Lingrui98 <[email protected]>
|
#
6474c47f |
| 14-Jul-2022 |
Yinan Xu <[email protected]> |
rob: optimize timing for commit and walk (#1644)
* rob: separate walk and commit valid bits
* rob: optimize instrCnt timing
* rob: fix blockCommit condition when flushPipe
When flushPipe is enabled, it will block commits in ROB. However,
in the deqPtrModule, the commit is not blocked. This commit fixes
the issue.
|
#
74515c5a |
| 12-Jul-2022 |
Yinan Xu <[email protected]> |
jump: delay pc and jalr_target for one cycle (#1640)
|
#
1cee9cb8 |
| 12-Jul-2022 |
Yinan Xu <[email protected]> |
ctrl: optimize the timing of dispatch2 stage (#1632)
* ctrl: copy dispatch2 to avoid cross-module loops
This commit makes copies of dispatch2 in CtrlBlock to avoid long
cross-module timing loop paths. Should be good for timing.
* dpq: re-write queue read logic
This commit adds a Reg-Vec to store the queue read data. Since
most queues read at most the current numRead and the next numRead
entries, the read timing can be optimized by reading the data one
cycle earlier.
|
#
0dc4893d |
| 10-Jul-2022 |
Yinan Xu <[email protected]> |
core: optimize redirect timing (#1630)
This commit adds separated redirect registers in ExuBlock and MemBlock.
They have one cycle latency compared to redirect in CtrlBlock. This will
help reduce the fanout of redirect registers.
|
#
0febc381 |
| 09-Jul-2022 |
Yinan Xu <[email protected]> |
decode: move fusion decoder result Mux to rename (#1631)
This commit moves the fusion decoder to both decode and rename stage.
In the decode stage, fusion decoder determines whether the instruction
pairs can be fused. Valid bits of decode are not affected by fusion
decoder. This should fix the timing issues of rename.valid.
In the rename stage, some fields are updated according to the result of
fusion decoder. This will bring a minor timing path to both valid and
other fields in uop in the rename stage. However, since freelist and
rat have worse timing, this should not cause timing issues.
|
#
a0db5a4b |
| 20-Jun-2022 |
Yinan Xu <[email protected]> |
decode: parallel fusion decoder and rat read (#1588)
|
#
005e809b |
| 26-May-2022 |
Jiuyang Liu <[email protected]> |
fix for chipsalliance/chisel3#2496 (#1563)
|
#
25ac26c6 |
| 11-May-2022 |
William Wang <[email protected]> |
Fix vcs simulation support, support manually set ram_size (#1551)
* difftest: disable runahead to make vcs happy
* difftest: bump huancun to make vcs happy
* difftest: bump difftest and ready-to-run
* difftest support ramsize and paddr base config
* 8GB/16GB nemu so are provided by ready-to-run
* ci: update nightly ci, manually set ram_size
* difftest: bump huancun to make vcs happy
* difftest,nemu: support run-time assign mem size
* ci: polish nightly ci script
|
#
b6900d94 |
| 28-Apr-2022 |
Yinan Xu <[email protected]> |
core,rob: support the WFI instruction
The RISC-V WFI instruction was previously decoded as NOP. This commit adds support for the real wait-for-interrupt (WFI).
We add a state_wfi FSM in the ROB. After WFI leaves the ROB, the next instruction will wait in the ROB until an interrupt.
|
#
2e1be6e1 |
| 14-Feb-2022 |
Steve Gou <[email protected]> |
ctrl,ftq: move pc and target calculation in redirect generator to ftq (#1463)
|
#
d7dd1af1 |
| 05-Jan-2022 |
Li Qianruo <[email protected]> |
Debug mode: various bug fixes (#1412)
* Reduce trigger hit wires that go into exceptiongen
* Fix frontend triggers rewriting hit wire
* Retrieved some accidentally dropped changes in branch dm-debug (mainly fixes to debug mode)
* Fix dmode in tdata1
* Fix ebreaks not causing exception in debug mode
* Fix dcsr field bugs
* Fix faulty distributed tEnable
* Fix store triggers not using vaddr
* Fix store trigger rewriting hit vector
* Initialize distributed tdata registers in MemBlock and Frontend to zero
* Fix load trigger select bit in mcontrol
* Fix singlestep bit valid in debug mode
* Mask all interrupts in debug mode
|
#
df5b4b8e |
| 18-Dec-2021 |
Yinan Xu <[email protected]> |
csr: optimize exception and trapTarget timing (#1372)
|
#
a1351e5d |
| 16-Dec-2021 |
Jay <[email protected]> |
Fix false hit bug after IFU timing optimization (#1367)
* fix invalidTakenFault use wrong seqTarget
* IFU: fix oversize bug
* ctrl: mark all flushes as level.flush for frontend
This commit changes how flushes behave for frontend.
When ROB commits an instruction with a flush, we notify the frontend
of the flush without the commit.
Flushes to frontend may be delayed by some cycles, and a commit before
the flush causes errors. Thus, we make all flush reasons behave the
same as exceptions for frontend, that is, RedirectLevel.flush.
* IFU: exclude lastTaken situation when judging beyond fetch
Co-authored-by: Yinan Xu <[email protected]>
|
#
fd7603d9 |
| 15-Dec-2021 |
Yinan Xu <[email protected]> |
rename: add fused lui and load (#1356)
This commit adds fused load support by bypassing LUI results to load.
For better timing, detection is done at the rename stage. Imm is stored
in psrc(1), psrc(0) and imm.
|
#
6f688dac |
| 11-Dec-2021 |
Yinan Xu <[email protected]> |
core: delay csrCtrl for two cycles (#1336)
This commit adds DelayN(2) to some CSR-related signals, including
control bits to ITLB, DTLB, PTW, etc.
To avoid accessing the ITLB before control bits change, we also need
to delay the flush for two cycles. We assume branch misprediction or
memory violation does not cause csrCtrl to change.
|
#
1ca0e4f3 |
| 10-Dec-2021 |
Yinan Xu <[email protected]> |
core: refactor hardware performance counters (#1335)
This commit optimizes the coding style and timing for hardware
performance counters.
By default, performance counters are RegNext(RegNext(_)).
|
#
6ab6918f |
| 09-Dec-2021 |
Yinan Xu <[email protected]> |
core: refactor writeback parameters (#1327)
This commit adds WritebackSink and WritebackSource parameters for
multiple modules. These traits hide implementation details from
other modules by defining IO-related functions in modules.
By using WritebackSink, ROB is able to choose the writeback sources.
Now fflags and exceptions are connected from exe units to reduce write
ports and optimize timing.
Further optimizations on write-back to RS and better coding style to
be added later.
|
#
d6477c69 |
| 05-Dec-2021 |
Yinan Xu <[email protected]> |
wb,load: delay load fp for one cycle (#1296)
|
#
980c1bc3 |
| 23-Nov-2021 |
William Wang <[email protected]> |
mem,mdp: use robIdx instead of sqIdx (#1242)
* mdp: implement SSIT with sram
* mdp: use robIdx instead of sqIdx
Dispatch refactor moves lsq enq to dispatch2; as a result, mdp cannot
get the correct sqIdx in dispatch. Unlike robIdx, it is hard to maintain a
"speculatively assigned" sqIdx, as it is hard to track store insts in
dispatch queue. Yet we can still use "speculatively assigned" robIdx
for memory dependency predictor.
For now, memory dependency predictor uses "speculatively assigned"
robIdx to track inflight store.
However, sqIdx is still used to track those stores whose addr is valid
but whose data is not valid. When load insts try to get forward data from
those stores, load insts will get that store's sqIdx and wait in RS.
They will not be woken until store data with that sqIdx is issued.
* mdp: add track robIdx recover logic
|
#
5668a921 |
| 16-Nov-2021 |
Jiawei Lin <[email protected]> |
Fix multi-core dedup bug (#1235)
* FDivSqrt: use hierarchy API to avoid dedup bug
* Dedup: use hartId from io port instead of core parameters
* Bump fudian
|
#
7057cff8 |
| 24-Oct-2021 |
Yinan Xu <[email protected]> |
lsq: enqueue at dispatch2 stage (#1167)
This commit changes when instructions enter load/store queue.
Now, at dispatch2, load/store instructions enter load/store queue.
|