Parameters.scala - OpenGrok history log for /XiangShan/src/main/scala/xiangshan/Parameters.scala

Revision	Date	Author	Comments
# 1e7e38e2	22-Apr-2025	Anzo <[email protected]>	chore(Parameters): remove the incorrect parameter description (#4391) This is a misrepresentation; currently, there is only one item in the fofbuffer.
# a25f1ac9	15-Apr-2025	Guanghui Cheng <[email protected]>	fix(trace): fix parameters of trace (#4561)
# 4a02bbda	15-Apr-2025	Anzo <[email protected]>	fix(LSU): misalign writeback aligned raw rollback (#4476) By convention, `rollback` and `writeback` must happen at the same time; `writeback` must not occur earlier than `rollback`. Currently, the `rollback` generated by RAW occurs at `s4`. A normal store takes an extra N beats after the end of `s3` (N depends on the number of RAWQueue entries and is currently 1 beat), which is equivalent to `writeback` at `s4`. A misaligned store, however, would `writeback` at `s2` and then `writeback` after switching to the `s_wb` state, which is equivalent to `writeback` at `s3`. This PR adjusts the misaligned `writeback` logic to align with the `StoreUnit` and unifies the way the number of beats is calculated.
# 30f35717	14-Apr-2025	cz4e <[email protected]>	refactor(DFT): refactor `DFT` IO (#4530)
# 4c0658ae	04-Apr-2025	Tang Haojin <[email protected]>	feat(backend): make wfi timeout configurable (#4491)
# 602aa9f1	02-Apr-2025	cz4e <[email protected]>	feat(Sram): add `SRAM_CTL` interface (#4474)
* add `SRAM_CTL` interface for SRAMTemplate
* use `SRAM_WITH_CTL` to enable, e.g. `make sim-verilog CONFIG=KunminghuV2Config RELEASE=1 SRAM_WITH_CTL=1`
# eaf14747	06-Mar-2025	cz4e <[email protected]>	fix(LoadUnit): enable EnableAccurateLoadError (#4363)
# 4b2c87ba	27-Feb-2025	梁森 Liang Sen <[email protected]>	feat(dfx): integrate dfx components (#4312)
# a7904e27	24-Feb-2025	Anzo <[email protected]>	fix(StoreQueue): fix threshold condition for force write sbuffer (#4306) Previously, the `ForceWrite` thresholds were hard-coded to (60, 55), which no longer applies after we adjusted `StoreQueueSize`. Now a more reasonable parameterized setting is used. However, the conditions for optimal performance still need to be tested.
# 8882eb68	21-Feb-2025	Xin Tian <[email protected]>	feat(bitmap/memenc): support memory isolation by bitmap checking and memory encryption using SM4-XTS (#3980)
- Add bitmap module in MMU for memory isolation
- Add memory encryption module based on the AXI protocol
- These modules can be disabled by setting the options `HasMEMencryption` and `HasBitmapCheck` to false
# 914bbc86	20-Feb-2025	xiaofeibao-xjtu <[email protected]>	chore(dispatch): remove useless code and files (#4288)
# b1d76493	27-Jan-2025	Tang Haojin <[email protected]>	chore(Parameters): add zawrs, zihintntl and ziccamoa into isa string (#4219)
# 4ba1d457	26-Jan-2025	Kunlin You <[email protected]>	submodule(utility): introduce XSPerfLevel for performance counter (#4238) This change introduces XSPerfLevel, with the levels `VERBOSE`/`NORMAL`/`CRITICAL`. Only counters with a level greater than or equal to the threshold are instantiated, which reduces utilization and compile time on Palladium. The PerfLevel threshold can be set on the command line and defaults to `VERBOSE`, which keeps all counters. Example usage: `SIM_ARGS="--perf-level CRITICAL"` or `PLDM_ARGS="--perf-level CRITICAL" PLDM=1`. The per-counter PerfLevel parameter also defaults to `VERBOSE`, so every counter that does not set it explicitly is dropped when the threshold is higher than that. Users can explicitly set the parameter to keep important counters instantiated, e.g. `XSPerfAccumulate(xx, yy, perfLevel = XSPerfLevel.CRITICAL)`.
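The threshold rule described in this commit message can be illustrated with a small model. This is only a sketch in Python, not the Chisel code in the utility submodule: the numeric values, the ordering `VERBOSE < NORMAL < CRITICAL` (inferred from the message), the `instantiated` helper, and the counter names are all assumptions made for illustration.

```python
from enum import IntEnum


class PerfLevel(IntEnum):
    # Assumed ordering, inferred from the commit message: a VERBOSE
    # threshold keeps every counter, a CRITICAL threshold keeps the fewest.
    VERBOSE = 0
    NORMAL = 1
    CRITICAL = 2


def instantiated(counters, threshold=PerfLevel.VERBOSE):
    """Return the names of counters whose level is >= the threshold."""
    return [name for name, level in counters if level >= threshold]


# Hypothetical counters; a counter that does not set a level is VERBOSE.
counters = [
    ("debug_stall_cycles", PerfLevel.VERBOSE),
    ("icache_miss", PerfLevel.NORMAL),
    ("commit_instr", PerfLevel.CRITICAL),
]

print(instantiated(counters))                      # default threshold
print(instantiated(counters, PerfLevel.CRITICAL))  # only critical counters
```

With the default threshold all three counters are kept; with a `CRITICAL` threshold only the counter explicitly marked `CRITICAL` survives, which matches the behaviour the message describes.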
# 74050fc0	26-Jan-2025	Yanqin Li <[email protected]>	perf(Uncache): add merge policy when entering (#4154)

# Background

## Problem

How to design a more efficient entry rule for a new load/store request when a load/store with the same address already exists in the `ubuffer`?

* Old Design: Always reject the new request.
* New Design: Consider merging requests.

## Merge Scenarios

‼️ If the new request can be merged into the existing one, both need to be `NC`.

1. New Store Request:
   1. Existing Store: Merge (the new store is younger).
   2. Existing Load: Reject.
2. New Load Request:
   1. Existing Load: Merge (the new load may be younger or older; both are OK to merge).
   2. Existing Store: Reject.

# What does this PR do?

## 1. Entry Actions

1. Allocate a new entry and mark it as `valid`:
   1. When there is no matching address.
2. Allocate a new entry and mark it as `valid` and `waitSame`:
   1. When there is a matching address, and:
      * The virtual addresses and attributes are the same.
      * The older entry is either selected to issue or issued.
3. Merge into an existing entry:
   1. When there is a matching address, and:
      * The virtual addresses and attributes are the same.
      * The older entry is neither selected to issue nor issued.
4. Reject the new request:
   1. When the ubuffer is full.
   2. When there is a matching address, but:
      * The virtual addresses or attributes are different.

NOTE: According to the definition in the TL-UL SPEC, the `mask` must be continuous and naturally aligned, and the `addr` must correspond to the mask. Therefore, the "same attributes" here introduces a new condition: the merged `mask` must meet the requirements of being continuous and naturally aligned (function `continueAndAlign`). During merging, the block offset of addr must be synchronously updated in `UncacheEntry.update`.

## 2. Handshake Mechanism Between `LoadQueueUncache (M)` and `Uncache (S)`

> `mid`: master id
>
> `sid`: slave id

Old Design:
- `M` sends a `req` with a `mid`.
- `S` receives the `req` and records the `mid`.
- `S` sends a `resp` with the `mid`.
- `M` receives the `resp` and matches it with the recorded `mid`.

New Design:
- `M` sends a `req` with a `mid`.
- `S` receives the `req` and responds with `{mid, sid}`.
- `M` matches it with the `mid` and updates its record with the received `sid`.
- `S` sends a `resp` with its `sid`.
- `M` receives the `resp` and matches it with the recorded `sid`.

Benefit: The new design allows `S` to merge requests when a new request enters.

## 3. Forwarding Mechanism

Old Design: Each address in the `ubuffer` is unique, so forwarding is straightforward based on a match.

New Design:
* A single address may have up to two entries matched in the `ubuffer`.
* If it has two matched entries, it must be true that one entry is marked `inflight` and the other entry is marked `waitSame`. In this case, the forwarded data comes from the merged data of the two entries, with the `inflight` entry being the older one.

## 4. Bug Fixes

1. In the `loadUnit`, `!tlbMiss` cannot be directly used as `tlbHit`, because when `tlbValid` is false, `!tlbMiss` can still be true.
2. `Uncache` state machine transition: The state indicating "able to send requests" (previously `s_refill_req`, now `s_inflight`) should not be triggered by `reqFire` but rather by `acquireFire`.

<img width="747" alt="image" src="https://github.com/user-attachments/assets/75fbc761-1da8-43d9-a0e6-615cc58cefef" />

# Evaluation

- ✅ timing
- ✅ performance

| Type           | 4B1000 | Speedup1-IO | 1B4096 | Speedup2-IO |
| -------------- | ------ | ----------- | ------ | ----------- |
| IO             | 51026  | 1           | 208149 | 1.00        |
| NC             | 42343  | 1.21        | 169248 | 1.23        |
| NC+OT          | 20379  | 2.50        | 160101 | 1.30        |
| NC+OT+mergeOpt | 16308  | 3.13        | 126369 | 1.65        |
| cache          | 1298   | 39.31       | 4410   | 47.20       |
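The entry rules listed under "Entry Actions" can be sketched as a small decision function. This is a behavioural model in Python, not the Chisel implementation: the `Entry` fields, the `decide` function, and its return strings are hypothetical names, the full check is applied first following a literal reading of rule 4.1, only the youngest matching entry is considered (a simplification), and `continue_and_align` is one plausible reading of the `continueAndAlign` condition (a single contiguous run of mask bits whose length is a power of two and whose start is a multiple of that length).

```python
from dataclasses import dataclass
from typing import List


def continue_and_align(mask: int) -> bool:
    """True if the set bits form one contiguous, naturally aligned run."""
    if mask == 0:
        return False
    low = (mask & -mask).bit_length() - 1      # index of the lowest set bit
    run = mask >> low
    if run & (run + 1):                        # run is not of the form 0b1...1
        return False
    size = run.bit_length()                    # number of bytes covered
    return (size & (size - 1)) == 0 and low % size == 0


@dataclass
class Entry:
    addr: int              # block address used for matching
    vaddr: int
    is_store: bool
    nc: bool               # non-cacheable attribute
    mask: int
    issued: bool = False   # selected to issue, or already issued


def decide(new: Entry, ubuffer: List[Entry], capacity: int) -> str:
    if len(ubuffer) >= capacity:
        return "reject"                        # rule 4.1: ubuffer is full
    matches = [e for e in ubuffer if e.addr == new.addr]
    if not matches:
        return "alloc_valid"                   # rule 1: no matching address
    old = matches[-1]                          # simplification: youngest match
    same = (old.vaddr == new.vaddr
            and old.is_store == new.is_store   # store/store or load/load only
            and old.nc and new.nc              # both must be NC
            and continue_and_align(old.mask | new.mask))
    if not same:
        return "reject"                        # rule 4.2
    if old.issued:
        return "alloc_valid_wait_same"         # rule 2
    return "merge"                             # rule 3
```

For example, a new NC store with mask `0b1100` merges into a not-yet-issued NC store to the same address with mask `0b0011`, because the merged mask `0b1111` is contiguous and aligned; the same request is rejected if the existing entry is a load.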
# 5bd65c56	14-Jan-2025	Tang Haojin <[email protected]>	feat(Config): add yaml parser for complicated parametrization (#4147)
This commit enables complicated parameterization by YAML parsing, using circe. It implements 6 configurations:
- PmemRanges: physical memory ranges
- PMAConfigs
- CHIAsyncBridge: set depth to 0 to disable it
- L2CacheConfig
- L3CacheConfig
- DebugModuleBaseAddr

For better human-readability, this commit changes `WithNKBL2/3` to `L2/3CacheConfig`, turns them into case classes, and makes the first parameter accept only human-readable size configurations like `0.5 MB` or `256kB`. This commit also changes PMAConfigs and PmemRanges into Lists of case classes.
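To illustrate what accepting sizes such as `0.5 MB` or `256kB` involves, here is a minimal Python sketch of a size-string parser. It is not the circe-based Scala code from the commit; the `parse_size` name, the accepted grammar, and the use of binary multiples (1 kB = 1024 B) are assumptions.

```python
import re

# Assumption: units are binary multiples, as is usual for cache sizes.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

_SIZE_RE = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([kKmMgG]?[bB])\s*")


def parse_size(text: str) -> int:
    """Convert a human-readable size such as '0.5 MB' into bytes."""
    match = _SIZE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value = float(match.group(1)) * _UNITS[match.group(2).upper()]
    if value != int(value):
        raise ValueError(f"size is not a whole number of bytes: {text!r}")
    return int(value)


print(parse_size("0.5 MB"))  # 524288
print(parse_size("256kB"))   # 262144
```

Rejecting anything that does not match the pattern, rather than guessing, mirrors the commit's intent that the parameter accept only human-readable size strings.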
# 552d2d4e	06-Jan-2025	Tang Haojin <[email protected]>	chore(Parameters): add svnapot extension string (#4133)
# 6c106319	30-Dec-2024	xu_zh <[email protected]>	feat(ICache): ECC error injection (#4044)

This PR is part of the RAS (Reliability, Availability, Serviceability) error recovery features.

- Add a series of mmio-mapped CSRs to control ICache ECC check & ECC inject features
- Implement ICache ECC injection
- M-state software can write `eccctrl` to trigger error injection to meta/dataArray, next read can trigger auto-recovery (implemented in #3899)
- Remove custom CSR `Sfetchctl`

# Details

## CSR

The base address of the added mmio-mapped CSRs is `0x38022080` and the registers are defined as below:

```
               64     10       7         4         2        1        0
0x00 eccctrl   | WARL | ierror | istatus | itarget | inject | enable |
               64     PAddrBits-1      0
0x08 ecciaddr  | WARL |     paddr      |
```

| CSR | field | description |
| --- | --- | --- |
| eccctrl | enable | ECC check enable |
| eccctrl | inject | ECC inject enable (write 1 to trigger injection, read always 0) |
| eccctrl | itarget | ECC inject target<br>0: metaArray<br>1: rsvd<br>2: dataArray<br>3: rsvd |
| eccctrl | istatus | ECC inject status (read-only)<br>0: idle: inject controller idle, goes to working when it receives an inject request (i.e. write 1 to eccctrl.inject)<br>1: working: inject controller working, goes to injected when finished / error when failed<br>2: injected, goes to idle after read<br>3: rsvd<br>4: rsvd<br>5: rsvd<br>6: rsvd<br>7: error: inject failed (check eccctrl.ierror for reason), goes to idle after read |
| eccctrl | ierror | ECC error reason (read-only, valid only if `eccctrl.istatus==error`)<br>0: ECC check is not enabled (i.e. `!eccctrl.enable`)<br>1: inject target invalid (i.e. `eccctrl.itarget==rsvd`)<br>2: inject addr (i.e. `ecciaddr.paddr`) not in ICache<br>3: rsvd<br>4: rsvd<br>5: rsvd<br>6: rsvd<br>7: rsvd |
| ecciaddr | paddr | Physical address of the inject target |

## Inject method

```asm
$INJECT_ADDR:
  # maybe do something else
  ret

test:
  la t0, $BASE_ADDR     # load icache control base addr
  la t1, $INJECT_ADDR   # load inject addr
  jalr ra, 0(t1)        # jump to injected addr to load it i
  sd t1, 8(t0)          # set inject addr
  la t2, (target << 2 | 1 << 1 | 1 << 0)  # load inject target & inject enable & ecc enable
  sd t1, 0(t0)          # set inject enable & ecc enable
loop:
  ld t1, 0(t0)          # get ecc control state
  andi t1, t1, (0b11 << (4+1)) # get high bits of inject state
  beqz t1, loop         # if is idle, or working, loop
  addi t1, t1, -1       # t1 = inject_state[2:1] - 1
  bnez t1, error        # if is not injected, error or rsvd
  jalr ra, 0(t1)        # jump to injected addr to trigger error
  j    finish
error:
  # handle error
finish:
  # finish
```

Or, check out https://github.com/OpenXiangShan/nexus-am/pull/48
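The `eccctrl` bit layout described above (enable at bit 0, inject at bit 1, itarget at bits 3:2, istatus at bits 6:4, ierror at bits 9:7) can be exercised with a short Python sketch. The helper names `encode_inject` and `decode_eccctrl` are hypothetical and exist only to illustrate the field positions; this is not driver code from the PR.

```python
# Field encodings taken from the register table in the commit message.
ISTATUS = {0: "idle", 1: "working", 2: "injected", 7: "error"}
ITARGET = {0: "metaArray", 2: "dataArray"}


def encode_inject(target: int) -> int:
    """Value to write to eccctrl: target | inject enable | ECC enable."""
    return (target << 2) | (1 << 1) | (1 << 0)


def decode_eccctrl(value: int) -> dict:
    """Split a raw eccctrl value into its named fields."""
    return {
        "enable": value & 1,
        "inject": (value >> 1) & 1,
        "itarget": ITARGET.get((value >> 2) & 0b11, "rsvd"),
        "istatus": ISTATUS.get((value >> 4) & 0b111, "rsvd"),
        "ierror": (value >> 7) & 0b111,
    }


request = encode_inject(2)            # inject into dataArray
print(hex(request), decode_eccctrl(request))
```

Decoding a read-back value with `istatus == "error"` and then inspecting `ierror` follows the same polling idea as the assembly listing, without depending on its register usage.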
# 17386530	25-Dec-2024	Anzo <[email protected]>	fix(LoadQueueRAR): aligning the size of `RARSize` to `VLQSize` (#4086) For vectors, if the size of the `RAR` is not equal to that of the `VLQ`, it can lead to a jam, because a vector `uop` splits into multiple Load operations and occupies multiple `VLQ` entries. A `RAR` entry is released when the corresponding `VLQ` entry is dequeued, whereas a vector requires all Load operations of the `uop` to be written back to the `MergeBuffer` before its `VLQ` entries can be dequeued. It may therefore happen that the deqptr of the `VLQ` is waiting for a vector `uop` to write back completely while the `RAR` is already full: the Load operations split from the vector `uop` cannot enter the `RAR`, so they wait in the `ReplayQueue` for the `RAR` to become non-full, while the `RAR` cannot be released because the `VLQ` cannot dequeue, which leads to deadlock.
# 452b5843	19-Dec-2024	Huijin Li <[email protected]>	power(MemBlock): power optimization in MemBlock (#4059) Power optimization: (1) use "withClockGate" instead of ClockGate in DCache; (2) reduce LSQ entries
# 4e7f9e52	16-Dec-2024	xiaofeibao <[email protected]>	fix(dispatch): fix bug of index vld instr, each uop can be index vld instr
# 49f2b250	10-Dec-2024	xiaofeibao <[email protected]>	timing(backend): each IQ has at least two simple entries
# c22ffc80	29-Nov-2024	xiaofeibao <[email protected]>	area(backend): reduce a vfcvt for better area
# 0d50d631	29-Nov-2024	xiaofeibao <[email protected]>	Revert "area(Backend): reduce VfScheduler iq num from 3 to 2 and remove a vfcvt fu" This reverts commit 10b44fa68ead2a8d79ce215b6bb116912f72f3a4.
# 11cc8561	26-Nov-2024	xiaofeibao <[email protected]>	area(Backend): reduce VfScheduler iq num from 3 to 2 and remove a vfcvt fu
# 8c6ac5eb	22-Nov-2024	xiaofeibao <[email protected]>	area(backend): reduce 4 fexu to 3 fexu