Code optimization (Development, MSX Fora) - MSX Resource Center
            
Code optimization

ARTRAG
msx master
Posts: 1592
Posted: 07 February 2008, 12:06
It has been a long time since anyone mentioned ASM coding on MRC.

Time to fix that.
Can this code be optimized?

The two functions copy the background under a frame between a "room" of size map_w x map_h (the height does not matter)
and a buffer: one saves the background tiles from the room into the buffer, the other copies them back. The shape of the frame is taken from external data.

The position in the room and the address of the buffer that stores the background are passed
as parameters in registers BC and DE.

The frame number is passed on the stack.

; de            source_addr (position in the room)
; bc            dest_addr (buffer that stores the background)
; ix+4, ix+5    nframe (passed on the stack)
    global _npctgrab
_npctgrab:
    push    ix
    ld  ix,0
    add ix,sp

    push    de
    ld  e,(ix+4)
    ld  d,(ix+5)
    ld  hl,_frames
    add hl,de
    add hl,de
    ld  e,(hl)
    inc hl
    ld  d,(hl)  ; de points to the current frame
    push    de
    pop     ix  ; now ix points to the current frame

    pop hl      ; hl points to the source in the room

    ld d,b      ; bc pointed to the destination in the frame buffer
    ld e,c      ; now de points to the destination in the frame buffer

1:  ld  a,(ix+0)      ; 127 == end of frame

    cp 127
    jp z,3f

    ld  c,a
    ld  b,0
    push hl
    add hl,bc       ; source

    ld  c,(ix+1)    ; len
    inc ix
    inc ix
    add ix,bc

    ldir

    pop hl

    ld  bc,(_map_w)
    add hl,bc

    jp  1b

3:  pop ix
    pop hl
    pop af
    jp  (hl)



; de            source_addr (buffer holding the saved background)
; bc            dest_addr (position in the room)
; ix+4, ix+5    nframe (passed on the stack)
    global _npctrest
_npctrest:
    push    ix
    ld  ix,0
    add ix,sp

    push    de
    ld  e,(ix+4)
    ld  d,(ix+5)
    ld  hl,_frames
    add hl,de
    add hl,de
    ld  e,(hl)
    inc hl
    ld  d,(hl)  ; de points to the current frame
    push    de
    pop     ix  ; now ix points to the current frame

    pop hl      ; hl points to the source (the saved background in the buffer)

    ld d,b      ; bc pointed to the destination in the room
    ld e,c      ; now de points to the destination in the room

1:  ld  a,(ix+0)      ; 127 == end of frame

    cp 127
    jp z,3f

    ld  c,a
    ld  b,0
    push de
    ex de,hl
    add hl,bc       ; destination (back in de after the next ex)
    ex de,hl

    ld  c,(ix+1)    ; len
    inc ix
    inc ix
    add ix,bc

    ldir
    pop de

    ld  bc,(_map_w)
    ex de,hl      
    add hl,bc
    ex de,hl  
    jp  1b

3:  pop ix
    pop hl
    pop af
    jp  (hl)



The frame data are structured like this:



framex1:
db 0,2,18,19 ; X offset of line 0, length of line 0, data, data, etc.
db 0,2,20,21 ; X offset of line 1, length of line 1, data, data, etc.
db 127       ; end of the frame

frame0:
db 5,1,147
db 4,1,147
db 3,1,147
db 2,1,147
db 1,1,147
db 127

; etc.

_frames:
    dw  framex1,frame0,frame1,frame2,frame3,frame4,frame5 ; etc.

ARTRAG
msx master
Posts: 1592
Posted: 07 February 2008, 13:35
(I mean optimized for speed, naturally.)
ro
msx guru
Posts: 2320
Posted: 07 February 2008, 15:48
Well, using the index registers (IX and IY) is never a clever trick where speed is concerned. They're slow. Using HL and doing some INCs and DECs will already speed it up. Make intelligent tables so you don't have to INC/DEC too many times.

Comparing the accumulator with 127 using CP, like you do, can be done faster by using AND #7F, JP NZ,xxxx

just some thoughts...
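
A note on the end-of-frame test, using nominal Z80 timings (MSX machines add one wait state per M1 cycle, which is not counted here). CP 127 and AND #7F both take 7 T-states, and AND #7F sets the Z flag when the low seven bits of A are zero rather than when A equals 127, so it does not look like a drop-in replacement. One small change that does seem safe is a relative jump for the exit, since JR Z costs 7 T-states when not taken against 10 for JP Z. A minimal fragment, assuming the rest of the loop stays as posted and label 3 remains within JR range:

```asm
1:  ld  a,(ix+0)    ; X offset of this line, or 127 == end of frame
    cp  127         ; 7 T-states, A is left unchanged
    jr  z,3f        ; 7 T-states when not taken, 12 when taken (once per frame)
    ld  c,a         ; the offset is still in A
    ld  b,0
```

That would save 3 T-states per line. The loop-back jump is probably better left as JP (10 T-states), because a taken JR costs 12.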
Metalion
msx freak
Posts: 215
Posted: 07 February 2008, 15:49
Try not to use the IX register; avoiding it will increase speed.

EDIT: posted at the same time as the previous message...
Huey
msx professional
Posts: 582
Posted: 07 February 2008, 15:57
Quote:

Try not to use the IX register; avoiding it will increase speed.



AFAIK the ASM code is called from Hitech-C, which puts parameters in the IX register...

ARTRAG
msx master
Posts: 1592
Posted: 07 February 2008, 16:05
@Huey
Not really. Hitech-C puts parameters on the stack before calling the function
and requires the called function to preserve the value of IX.

@ro and Metalion
I'd like to avoid IX and IY in the loop, but I do not know how; this is why I am asking for help.

MicroTech
msx lover
Posts: 109
Posted: 07 February 2008, 16:28
Quote:

I'd like to avoid IX and IY in the loop, but I do not know how; this is why I am asking for help.


Do you have an equivalent C source?
Maybe it can be re-compiled with ASCII-C (which does not use index registers) and we can take inspiration from the resulting asm code.

ARTRAG
msx master
Posts: 1592
Posted: 07 February 2008, 16:42
@MicroTech
No, this code is handwritten and designed to be called from code compiled with Hitech-C.
That only affects how the input parameters are passed, and it means IX has to be restored on exit.

ARTRAG
msx master
Posts: 1592
Posted: 07 February 2008, 17:00
Needless to say, I've tried to avoid the use of IX, but I do not see any real solution.
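
One possible direction, offered only as an untested sketch: keep the frame pointer in HL and move the copy itself (room row pointer, buffer pointer, LDIR) into the alternate register set, switching with EXX. The name `_npctgrab2` is hypothetical. The interface is assumed to be the same as `_npctgrab` above (DE = source in the room, BC = destination buffer, nframe on the stack, callee removes nframe). It also assumes that neither the compiled C code nor any interrupt handler depends on the contents of the alternate registers.

```asm
; Sketch only, not tested on real hardware.
; de = source_addr in room, bc = dest_addr in buffer, nframe on the stack
    global _npctgrab2       ; hypothetical name
_npctgrab2:
    pop hl                  ; hl = return address
    ex  (sp),hl             ; hl = nframe, return address back on the stack

    push de
    push bc
    exx
    pop de                  ; de' = destination in the buffer
    pop hl                  ; hl' = source row in the room
    exx

    add hl,hl               ; 2 bytes per entry in _frames
    ld  de,_frames
    add hl,de
    ld  a,(hl)
    inc hl
    ld  h,(hl)
    ld  l,a                 ; hl = current frame
    ld  b,0                 ; b stays 0 in this register set

1:  ld  a,(hl)              ; X offset, 127 == end of frame
    cp  127
    jr  z,3f

    exx
    push hl                 ; save start of the room row
    ld  c,a
    ld  b,0
    add hl,bc               ; hl' = row start + X offset
    exx

    inc hl
    ld  a,(hl)              ; length of this line
    inc hl
    ld  c,a
    add hl,bc               ; skip the data bytes: hl = next line

    exx
    ld  c,a                 ; bc' = length (b' is still 0)
    ldir
    pop hl                  ; start of the room row
    ld  bc,(_map_w)
    add hl,bc               ; next room row
    exx
    jp  1b

3:  ret                     ; nframe was already removed from the stack
```

By a rough count of nominal T-states, the loop body comes to about 159 per line excluding LDIR, against about 174 for the posted version, so the gain is modest. IX is never touched, which also removes the need to save and restore it.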
jltursan
msx professional
Posts: 847
Posted: 07 February 2008, 17:10
Optimization based on avoiding the index registers is a good idea, but there is not much iteration over these instructions. Most of the time is spent in LDIR, so I think the best idea is to unroll the LDIR: repeat (width) LDIs, (height) times... at a size cost, of course.

I've just remembered a "Fast LDIR routine" posted somewhere in the forum. It was based on LDI, of course, but with variable length, not a fixed length as would be the case here...
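
To put rough numbers on the unrolling idea, using nominal Z80 timings: LDIR takes 21 T-states for every byte except the last, which takes 16, while each LDI takes 16. The two alternatives below copy one line whose length is assumed to be a fixed 4 bytes (a hypothetical width), with HL and DE already set up:

```asm
; Alternative A: looped copy, 3*21 + 16 = 79 T-states (plus 10 for ld bc,4)
    ld  bc,4
    ldir

; Alternative B: unrolled copy, 4*16 = 64 T-states
; (each ldi also decrements bc, so bc cannot be relied on afterwards)
    ldi
    ldi
    ldi
    ldi
```

The gain is 5 T-states for each byte after the first, so unrolling only pays off on longer lines and needs the length to be known when the code is written.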
ARTRAG
msx master
Posts: 1592
Posted: 07 February 2008, 17:13
Sad to say, most of the time LDIR moves only 1 or 2 bytes at a time...
This depends on the shape of the frame, though, so I cannot unroll it, as I do not know the length of line X in advance.
jltursan
msx professional
Posts: 847
Posted: 07 February 2008, 17:16
Damn...
If it's never more than 2 bytes (I suppose that is not the case), maybe you can use alternative methods like a simple LD ($NNNN),HL. If the length is truly variable between 1 and n (with n>2 possible), then you are stuck with LDIs. I can't see any other way to transfer the data, a faster one I mean.
Metalbrain
msx friend
Posts: 15
Posted: 07 February 2008, 17:20
    push    de
    pop     ix  ; now ix points to the current frame


If you don't mind using the undocumented instructions, I think this is faster:

    ld ixh,d
    ld ixl,e

ARTRAG
msx master
Posts: 1592
Posted: 07 February 2008, 17:21
It all depends on the shape of the current frame, and in the game
it can vary a lot, from a simple NxM rectangle (e.g. a door) to a vine hanging from a tree (lots of lines with only one byte and different offsets).
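
For illustration, this is how those two extremes might look in the frame format shown in the first post (X offset, length, data bytes, with 127 closing the frame). The labels and tile numbers are made up for the example:

```asm
frame_door:             ; hypothetical 2x3 rectangle: same offset and length on every line
    db 0,2,40,41
    db 0,2,42,43
    db 0,2,44,45
    db 127              ; end of the frame

frame_vine:             ; hypothetical vine: one tile per line, offset changes from line to line
    db 2,1,60
    db 3,1,60
    db 2,1,60
    db 1,1,60
    db 127              ; end of the frame
```

With the door, LDIR moves 2 bytes per line. With the vine it moves a single byte per line, so the per-line overhead around LDIR dominates.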

ARTRAG
msx master
Posts: 1592
Posted: 07 February 2008, 17:28
Quote:

    push    de
    pop     ix  ; now ix points to the current frame


If you don't mind using the undocumented instructions, I think this is faster:

    ld ixh,d
    ld ixl,e



push de ; 11 T states
pop IX ; 14 T states

total 25

ld ixh,d ; 8 T states
ld ixl,e ; 8 T states

total 16
Yes, this is a saving, but not what I was hoping for (it is outside the inner loop, so almost negligible...)

Anyway, I'll change
    push    de
    pop     ix  ; now ix points to the current frame

to
    db 0xdd
    ld h,d
    db 0xdd
    ld l,e


 
(c) 1994 - 2008 Stichting MSX Resource Center. MSX is a trademark of MSX Licensing Corporation.