//===---------------------------------------------------------------------===//

Common register allocation / spilling problem:

        mul lr, r4, lr
        str lr, [sp, #+52]
        ldr lr, [r1, #+32]
        sxth r3, r3
        ldr r4, [sp, #+52]
        mla r4, r3, lr, r4

can be:

        mul lr, r4, lr
        mov r4, lr
        str lr, [sp, #+52]
        ldr lr, [r1, #+32]
        sxth r3, r3
        mla r4, r3, lr, r4

and then "merge" mul and mov:

        mul r4, r4, lr
        str r4, [sp, #+52]
        ldr lr, [r1, #+32]
        sxth r3, r3
        mla r4, r3, lr, r4

It also increases the likelihood that the store becomes dead.
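
As a sanity check on the rewrite above, here is a minimal sketch that runs the
three sequences through a toy interpreter and confirms they leave the same
registers and the same spill slot behind. The interpreter, the initial register
values, and the names run, ORIGINAL, WITH_COPY and MERGED are all assumptions
made for illustration; it models only the six opcodes used here and nothing
else about ARM.

```python
MASK = 0xFFFFFFFF  # registers are modelled as 32-bit unsigned values


def sxth(value):
    """Sign-extend the low 16 bits of value to 32 bits."""
    value &= 0xFFFF
    return (value - 0x10000 if value & 0x8000 else value) & MASK


def run(program, regs, mem):
    """Interpret a tiny, hypothetical subset of ARM; returns (regs, mem)."""
    regs, mem = dict(regs), dict(mem)
    for op, *a in program:
        if op == "mul":      # mul rd, rn, rm
            regs[a[0]] = (regs[a[1]] * regs[a[2]]) & MASK
        elif op == "mla":    # mla rd, rn, rm, ra  =>  rd = rn * rm + ra
            regs[a[0]] = (regs[a[1]] * regs[a[2]] + regs[a[3]]) & MASK
        elif op == "mov":    # mov rd, rm
            regs[a[0]] = regs[a[1]]
        elif op == "sxth":   # sxth rd, rm
            regs[a[0]] = sxth(regs[a[1]])
        elif op == "str":    # str rt, [rn, #imm]
            mem[regs[a[1]] + a[2]] = regs[a[0]]
        elif op == "ldr":    # ldr rt, [rn, #imm]
            regs[a[0]] = mem[regs[a[1]] + a[2]]
        else:
            raise ValueError("unsupported opcode: " + op)
    return regs, mem


ORIGINAL = [
    ("mul", "lr", "r4", "lr"), ("str", "lr", "sp", 52),
    ("ldr", "lr", "r1", 32), ("sxth", "r3", "r3"),
    ("ldr", "r4", "sp", 52), ("mla", "r4", "r3", "lr", "r4"),
]
WITH_COPY = [
    ("mul", "lr", "r4", "lr"), ("mov", "r4", "lr"),
    ("str", "lr", "sp", 52), ("ldr", "lr", "r1", 32),
    ("sxth", "r3", "r3"), ("mla", "r4", "r3", "lr", "r4"),
]
MERGED = [
    ("mul", "r4", "r4", "lr"), ("str", "r4", "sp", 52),
    ("ldr", "lr", "r1", 32), ("sxth", "r3", "r3"),
    ("mla", "r4", "r3", "lr", "r4"),
]

# Arbitrary starting state, chosen only so that sxth has visible work to do.
REGS = {"r1": 0x1000, "r3": 0x12348001, "r4": 7, "lr": 6, "sp": 0x2000}
MEM = {0x1000 + 32: 5}

results = [run(p, REGS, MEM) for p in (ORIGINAL, WITH_COPY, MERGED)]
assert results[0] == results[1] == results[2]
```

The check covers one input state only, so it illustrates the equivalence rather
than proving it. Note that the merged form has one instruction fewer and no
reload from the spill slot.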

//===---------------------------------------------------------------------===//

bb27 ...
        ...
        %reg1037 = ADDri %reg1039, 1
        %reg1038 = ADDrs %reg1032, %reg1039, %NOREG, 10
    Successors according to CFG: 0x8b03bf0 (#5)

bb76 (0x8b03bf0, LLVM BB @0x8b032d0, ID#5):
    Predecessors according to CFG: 0x8b0c5f0 (#3) 0x8b0a7c0 (#4)
        %reg1039 = PHI %reg1070, mbb<bb76.outer,0x8b0c5f0>, %reg1037, mbb<bb27,0x8b0a7c0>

Note that ADDri is not a two-address instruction. However, its result %reg1037
is an operand of the PHI node in bb76 and its operand %reg1039 is the result of
the PHI node. We should treat it as a two-address instruction and make sure the
ADDri is scheduled after any node that reads %reg1039.

//===---------------------------------------------------------------------===//

Use local info (i.e.
register scavenger) to assign it a free register to allow
reuse:
        ldr r3, [sp, #+4]
        add r3, r3, #3
        ldr r2, [sp, #+8]
        add r2, r2, #2
        ldr r1, [sp, #+4]  <==
        add r1, r1, #1
        ldr r0, [sp, #+4]
        add r0, r0, #2

//===---------------------------------------------------------------------===//

LLVM aggressively lifts common subexpressions out of loops. Sometimes this can
have negative side effects:

R1 = X + 4
R2 = X + 7
R3 = X + 15

loop:
load [i + R1]
...
load [i + R2]
...
load [i + R3]

Under high register pressure, R1, R2, and R3 can be spilled.
We need
to implement proper re-materialization to handle this:

R1 = X + 4
R2 = X + 7
R3 = X + 15

loop:
R1 = X + 4  @ re-materialized
load [i + R1]
...
R2 = X + 7  @ re-materialized
load [i + R2]
...
R3 = X + 15 @ re-materialized
load [i + R3]

Furthermore, with re-association, we can enable sharing:

R1 = X + 4
R2 = X + 7
R3 = X + 15

loop:
T = i + X
load [T + 4]
...
load [T + 7]
...
load [T + 15]

//===---------------------------------------------------------------------===//

It's not always a good idea to choose rematerialization over spilling.
If all
the load / store instructions would be folded then spilling is cheaper because
it won't require new live intervals / registers. See 2003-05-31-LongShifts for
an example.

//===---------------------------------------------------------------------===//

With a copying garbage collector, derived pointers must not be retained across
collector safe points; the collector could move the objects and invalidate the
derived pointer. This is bad enough in the first place, but safe points can
crop up unpredictably. Consider:

        %array = load { i32, [0 x %obj] }** %array_addr
        %nth_el = getelementptr { i32, [0 x %obj] }* %array, i32 0, i32 %n
        %old = load %obj** %nth_el
        %z = div i64 %x, %y
        store %obj* %new, %obj** %nth_el

If the i64 division is lowered to a libcall, then a safe point will (must)
appear for the call site. If a collection occurs, %array and %nth_el no longer
point into the correct object.
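
The hazard can be reproduced in a toy model. The sketch below is not LLVM's GC
interface; the Heap class, its alloc/collect/store methods, and the address
layout are all hypothetical, chosen only to show a derived address going stale
when the collector moves its base object at a safe point, and the same address
working again once it is recomputed from the root.

```python
class Heap:
    """Toy moving heap: each object is a list of slots at an integer base."""

    def __init__(self):
        self.objects = {}       # base address -> list of slots
        self.next_free = 0x1000

    def alloc(self, nslots):
        base = self.next_free
        self.objects[base] = [None] * nslots
        self.next_free += 0x100
        return base

    def collect(self, roots):
        """Move every rooted object to a fresh address and update the roots."""
        for name, old in list(roots.items()):
            new = self.next_free
            self.next_free += 0x100
            self.objects[new] = self.objects.pop(old)
            roots[name] = new

    def store(self, addr, value):
        """Store through a raw address; raise if no live object contains it."""
        for base, slots in self.objects.items():
            if base <= addr < base + len(slots):
                slots[addr - base] = value
                return
        raise RuntimeError("stale derived pointer: 0x%x" % addr)


heap = Heap()
roots = {"array_addr": heap.alloc(8)}
n = 3

nth_el = roots["array_addr"] + n   # derived pointer, computed before the call
heap.collect(roots)                # safe point: the libcall for the division

try:
    heap.store(nth_el, "new")      # uses the address computed before the move
    stale_store_succeeded = True
except RuntimeError:
    stale_store_succeeded = False

nth_el = roots["array_addr"] + n   # recomputed from the root after the call
heap.store(nth_el, "new")
```

In this model the stale store is detected and rejected; a real collector gives
no such guarantee, and the store may silently land in unrelated memory.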

The fix for this is to copy address calculations so that dependent pointers
are never live across safe point boundaries. But the loads cannot be copied
like this if there was an intervening store, so this may be hard to get right.

Only a concurrent mutator can trigger a collection at the libcall safe point.
So single-threaded programs do not have this requirement, even with a copying
collector. Still, LLVM optimizations would probably undo a front-end's careful
work.

//===---------------------------------------------------------------------===//

The OCaml frametable structure supports liveness information. It would be good
to support it.

//===---------------------------------------------------------------------===//

The FIXME in ComputeCommonTailLength in BranchFolding.cpp needs to be
revisited. The check is there to work around a misuse of directives in inline
assembly.

//===---------------------------------------------------------------------===//

It would be good to detect collector/target compatibility instead of silently
doing the wrong thing.

//===---------------------------------------------------------------------===//

It would be really nice to be able to write patterns in .td files for copies,
which would eliminate a bunch of explicit predicates on them (e.g. no side
effects). Once this is in place, it would be even better to have tblgen
synthesize the various copy insertion/inspection methods in TargetInstrInfo.

//===---------------------------------------------------------------------===//

Stack coloring improvements:

1. Do proper LiveStackAnalysis on all stack objects including those which are
   not spill slots.
2. Reorder objects to fill in gaps between objects.
   e.g.
   4, 1, <gap>, 4, 1, 1, 1, <gap>, 4 => 4, 1, 1, 1, 1, 4, 4

//===---------------------------------------------------------------------===//

The scheduler should be able to sort nearby instructions by their address. For
example, in an expanded memset sequence it's not uncommon to see code like this:

  movl $0, 4(%rdi)
  movl $0, 8(%rdi)
  movl $0, 12(%rdi)
  movl $0, 0(%rdi)

Each of the stores is independent, and the scheduler is currently making an
arbitrary decision about the order.

//===---------------------------------------------------------------------===//

Another opportunity in this code is that the $0 could be moved to a register:

  movl $0, 4(%rdi)
  movl $0, 8(%rdi)
  movl $0, 12(%rdi)
  movl $0, 0(%rdi)

This would save substantial code size, especially for longer sequences like
this.
It would be easy to have a rule telling isel to avoid matching MOV32mi
if the immediate has more than some fixed number of uses. It's more involved
to teach the register allocator how to do late folding to recover from
excessive register pressure.
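
To put rough numbers on the code size claim, the sketch below estimates the
encoded size of the four stores in both forms and applies a use-count rule of
the kind described. The byte counts assume the usual x86-64 encodings with
%rdi as base and displacements that fit in 8 bits; the helper names, the choice
of %eax as the zero register, and the threshold of 2 are hypothetical and are
not taken from the X86 backend.

```python
def store_imm32_size(disp):
    """movl $imm32, disp(%rdi): opcode + ModRM + optional disp8 + imm32."""
    assert -128 <= disp <= 127, "sketch only models 8-bit displacements"
    return 1 + 1 + (1 if disp != 0 else 0) + 4


def store_reg_size(disp):
    """movl %eax, disp(%rdi): opcode + ModRM + optional disp8."""
    assert -128 <= disp <= 127, "sketch only models 8-bit displacements"
    return 1 + 1 + (1 if disp != 0 else 0)


ZERO_REG_SIZE = 2  # xorl %eax, %eax


def prefer_register(num_uses, max_imm_uses=2):
    """Hypothetical isel rule: stop folding the immediate past a use count."""
    return num_uses > max_imm_uses


displacements = [4, 8, 12, 0]  # the memset sequence shown above

imm_bytes = sum(store_imm32_size(d) for d in displacements)
reg_bytes = ZERO_REG_SIZE + sum(store_reg_size(d) for d in displacements)
use_register = prefer_register(len(displacements))
```

Under these assumptions the immediate form takes 27 bytes and the register form
13, and the gap grows by 4 bytes per additional store. The estimate ignores the
cost of tying up a register, which is exactly the pressure problem that late
folding in the register allocator would have to undo.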