Crystal clean PCM 8bit samples on the poor PSG (Development MSX Fora)MSX Resource Center MSX Info Update - Finnish MSX madness at its best           
            
English Nederlands Español Português Russian         
 Nieuws
   Voorpagina
  Nieuws archief
  Nieuws onderwerpen

 Informatie
   MSX Fora
  Artikelen
  Recensies
  Beursverslagen
  Fotoreportages
  Beurzen en meetings
  Enquêtes
  Links
  Zoek

 Software
   Downloads
  Webshop

 MRC
   Wie we zijn
  Kom bij ons team
  Doneren
  Policies
  Contact met het MRC
  Link naar Ons
  Statistieken

 Zoek
 
  

  

 Login
 

Gebruikersnaam

Wachtwoord




Ben je nog niet lid? Klik hier en word MSX vriend!


 Statistieken
 

Er zijn 152 gasten en 0 MSX vrienden online

Je bent een anonieme bezoeker.
 

MSX Fora


MSX Fora

Development - Crystal clean PCM 8bit samples on the poor PSG

Ga naar pagina ( 1 | 2 | 3 | 4 | 5 Volgende pagina )
Schrijver

Crystal clean PCM 8bit samples on the poor PSG

ARTRAG
msx master
Berichten: 1495
Geplaatst: 29 December 2005, 13:09   
Ok this is time to share a nice routine for playing 8bit PCM samples on the PSG without the ugly spikes that are generated by the previous conventional techniques that play sample by sample each PCM sample using a 3 channels table.

Go to http://map.tni.nl/articles/psg_sample.php for details on the “standard” solution.

In a word, the standard solution associates each 8bit sample to a triplet of channel levels suitably chosen in order to minimize the error of the PSG overall output level with respect to the given sample. As the sum of the 3 channels leads to 608 disjoint values, you would expect a the output quality of a 9bit quantizer, but, unfortunately, the resulting levels are oddly spaced, so the theorical quality corresponds in dB to about 48dB of SNR, which is turn is equivalent to a 8bit uniform quantizer.

Problem:
The “standard” solution has a very annoying problem: the z80 cannot change the 3 PSG channels at the same time, this implies that, during the transition between two successive samples, the PSG passes trough undefined states where some channels are from the previous triplet and some channels are from the triplet corresponding to the next sample.

The result is that, at each transition between two successive samples, the PSG can produce 1 or 2 spikes of short duration. This effect, repeated at each sample transition, becomes a very annoying wideband noise that sensibly reduce the overall quality of the output, far below the optimal 48dB of SNR.

Solution:
The sole solution is to control the PSG trough every transition.

As the z80 can change a single channel at time and each channel has 16 values, the PSG can be modelled at any time as a state machine with 256 states, 16 inputs, and 608 outputs.
The 256 states are the 16*16 values of the old channels (those that don’t change at the current time), the 16 inputs are the allowed values of the current channel, the 608 outputs are the sum of the volume values from the “state” and of the volume value from the current input.
Each sample transition corresponds to three steps of the state machine.
Assume ABC the current values playing and A’B’C’ the new values.
The transitions are:

ABC -> ABC’
ABC’-> AB’C’
AB’C’-> A’B’C’

So, between the level ABC corresponding to the previous sample and the level A’B’C’ corresponding to the next sample, we get :
ABC
ABC’
AB’C’
A’B’C’

Assume x be the sequence of input samples, we want to minimize the mean square error between the x and the output sequence of levels in any single transition, this implies we need to develop a search algorithm that explores all the sequences of state transitions of the PSG corresponding to the allowed outputs, and finds the best sequence of channel levels that minimize the mean square error.

The algorithm is already existing, and is the classical Viterbi algorithm, I do not run into details here, but in case I will go back on this classical topic.

The output of such an algorithm is the sequence of levels for each channel that encodes the input sequence x.

Now the new problem: each triplet is 4*3= 12 bit : a huge waste of space!
Solution run length encoding.
The best solution should be a RLE based on nibbles, as some "detail channels" can have very long bursts of non equal symbols. Due to lack of time and of energy I did as you can see in the following.
I know that the implementation could be improved, but I cannot fit any a better solution in 447 t cycles (8KHz of PCM player).
Enjoy

ARTRAG
msx master
Berichten: 1495
Geplaatst: 29 December 2005, 13:10   

Fs          equ 8192  ;     alternative 8000 or 8192

;------------------
     MACRO  PsgW reg,value
            ld a,reg    ; select AY channel A volume register ; 7+1
            out (#A0),a                                       ; 11+1
            ld  a,value                                       ; 7+1
            out (#A1),a ; and write back to the AY            ; 11+1
            ENDM                                              ; total 36+4 = 40T
;------------------

        MACRO WaitXX
            if (Fs==8000)
                .3 nop          ; 3*5 = 15
            else
            if  (Fs==8192)
                nop             ; 5
            else
                assert 0
            endif
            endif
            ENDM            ; total 15 or 5
            
        MACRO Wait68 
            .2 ex (sp),hl   ; 2*20 =40
            .4 nop          ; 4*5 = 20
            cp 0            ; 8    
            ENDM            ; total 68

;------------------

        MACRO next  reg
            ld  a,(hl)      ; 8
            .4 rrca         ; 4*5
            and 15          ; 8
            ld  reg,a       ; 5
            ld  a,(hl)      ; 8
            and 15          ; 8
            exx             ; 5
            ld  reg,a       ; 5
            exx             ; 5
            inc hl          ; 7
            ENDM            ; tot = 79

;------------------

        OUTPUT wavplay.com
       
        ORG 100h

START:
        di
        PsgW 0,255
        PsgW 1,255
        PsgW 2,255
        PsgW 3,255
        PsgW 4,255
        PsgW 5,255
        PsgW 6,255
        PsgW 7,10111111B


;-------------------------------------
; IN   HL - Encoded sample start address
;      HL'- Sample length (#pcm saples)
;-------------------------------------

        ld  hl, SAMPLE+2
        ld  bc,#0101
        ld  de,bc
        exx
        ld  hl, (SAMPLE)
        ld  c,#A1           
        

.LOOP:      
        exx             ; 5
        dec b           ; 5
        ld  a,15         ; 8
        and b            ; 5
        jp nz,waitA     ; 11 
        next  b         ; 79
1       dec d           ; 5
        ld  a,15         ; 8
        and d            ; 5
        jp nz,waitB     ; 11 
        next  d         ; 79
2       dec e           ; 5
        ld  a,15         ; 8
        and e            ; 5
        jp nz,waitC     ; 11 
        next  e         ; 79
3       exx             ; 5
        
        ld  a,8         ; 8
        out (#a0),a     ; 12
        inc a           ; 5
        out (c),b       ; 14
        out (#a0),a     ; 12
        inc a           ; 5
        out (c),d       ; 14
        out (#a0),a     ; 12
        out (c),e       ; 14
            
        WaitXX          ; 15 or 5
        
        dec hl          ; 7
        ld  a,h         ; 5
        or  l           ; 5
        jp  nz,.LOOP    ; 11
                        ; tot  = 447T or 437T
        ret

waitA   Wait68          ; 68
        jp   1B         ; 11
                        ; tot = 79

waitB   Wait68          ; 68
        jp   2B         ; 11
                        ; tot = 79
                        
waitC   Wait68          ; 68
        jp   3B         ; 11
                        ; tot = 79


;       b7b6b5b4|b3b2b1b0
;       NumTick |LevVolum


SAMPLE:
        include "C:\MATLAB6p5\work\out.txt"

FINISH:

ARTRAG
msx master
Berichten: 1495
Geplaatst: 29 December 2005, 13:14   
To whom is interested in going deeper in the way the Viterbi optimization is performed

[y,Fs,nbit,opt]=wavread('bluetooth_input.wav');

%load handel

y = (y-min(y)) /(max(y)-min(y)) * 1.15;                % input tra 0 e 1.15

f = 3579545;
fpcm = Fs;

tp = f/fpcm; % 447 = 8000Hz, 437 = 8192Hz
dt1 = (12+5+14)/tp;
dt2 = (12+14)/tp;
dt3 = 1-dt1-dt2;

N = 3*length(y);

x = zeros(1,N);
x(1:3:end) = y;
x(2:3:end-3) = y(1:end-1)*(1-dt1)+y(2:end)*dt1;
x(3:3:end-3) = y(1:end-1)*(1-dt1-dt2)+y(2:end)*(dt1+dt2);
x(end-2:end)= y(end);

n=[0:15]';
vol=2.^(n/2)/2^7.5;
vol(1)=0;

nxtS = uint8(zeros(256,16));
for i=0:15
    for j=0:15
        for in=0:15
            nxtS(i*16+j+1,in+1) = uint8(j*16 + in);
        end
    end
end

curV = zeros(256,16);
for i=0:15
    for j=0:15
        for in=0:15
            curV(i*16+j+1,in+1) = vol(i+1)+vol(j+1)+vol(in+1);
        end
    end
end    

Stt = uint8(zeros(256,N));
Itt = uint8(zeros(256,N));

L  = zeros(256,1);
St = uint8(ones(256,1));
It = uint8(ones(256,1));

for t =1:N
    Ln = inf*ones(256,1);
    if mod(t-1,1000)==0
        proc = fix(t/N *100)
    end
    for cs=0:255
        for in=0:15
            cv = curV(cs+1,in+1);
            ns = double(nxtS(cs+1,in+1));

            switch mod(t-1,3)
                case 0
                    Ltst = L(cs+1)+dt1*abs(x(t)-cv)^2;
                case 1
                    Ltst = L(cs+1)+dt2*abs(x(t)-cv)^2;
                case 2
                    Ltst = L(cs+1)+dt3*abs(x(t)-cv)^2;
            end

            if  Ln(ns+1) >= Ltst
                Ln(ns+1) = Ltst;
                St(ns+1) = cs;
                It(ns+1) = in;
            end
        end
    end
    L = Ln;
    Stt(:,t) = St;
    Itt(:,t) = It;
end

[l,i] = min(L);

P = uint8(zeros(1,N));
I = uint8(zeros(1,N));

P(N) = Stt(i,N);
I(N) = Itt(i,N);
for t = (N-1):-1:1
   P(t) = Stt(double(P(t+1))+1,t);
   I(t) = Itt(double(P(t+1))+1,t);
end

V = zeros(1,N);
for t = 1:N
   V(t) = curV(double(P(t))+1,double(I(t))+1); 
end

er = sum(abs(x(1:3:end)-V(1:3:end)).^2)*dt1+...
     sum(abs(x(2:3:end)-V(2:3:end)).^2)*dt2+...
     sum(abs(x(3:3:end)-V(3:3:end)).^2)*dt3

en = sum(abs(x(1:3:end)).^2)*dt1+...
     sum(abs(x(2:3:end)).^2)*dt2+...
     sum(abs(x(3:3:end)).^2)*dt3

disp('SNR db=')
SNR = 10*log10(en/er)



j=1; 
A = 0;  
B = 0;  
C = 0;
la = 1; 
lb = 1; 
lc = 1;
ja=1;   
jb=2;   
jc=2;
s = [];

for i=1: length (I)
    switch mod(i-1,3)
        case 0
            if A==I(i)
                la=la+1;
                if la==17
                    s(ja)=16*16+double(A);
                    la=1;
                    ja=j;
                    j=j+1;
                end
            else
                s(ja)=16*la+double(A);
                la=1;
                ja=j;
                j=j+1;
                A=I(i);
            end
        case 1
            if B==I(i)
                lb=lb+1;
                if lb==17
                    s(jb)=16*16+double(B);
                    lb=1;
                    jb=j;
                    j=j+1;
                end
            else
                s(jb)=16*lb+double(B);
                lb=1;
                jb=j;
                j=j+1;
                B=I(i);
            end
        case 2
            if C==I(i)
                lc=lc+1;
                if lc==17
                    s(jc)=16*16+double(C);
                    lc=1;
                    jc=j;
                    j=j+1;
                end
            else
                s(jc)=16*lc+double(C);
                lc=1;
                jc=j;
                j=j+1;
                C=I(i);
            end
    end
    
end

disp('actual len=')
disp(length(s))

I0 = I(1:3:end);
I1 = I(2:3:end);
I2 = I(3:3:end);

i0 = find(diff(double(I0)));
str0=[];
for j=0:(length(i0)-1)
    if j==0
        t = I0(1:i0(j+1));
    else
        t = I0(i0(j)+1:i0(j+1));
    end
    str0(j+1,1) = length(t);
    str0(j+1,2) = I0(i0(j+1));
end

i1 = find(diff(double(I1)));
str1=[];
for j=0:(length(i1)-1)
    if j==0
        t = I1(1:i1(j+1));
    else
        t = I1(i1(j)+1:i1(j+1));
    end
    str1(j+1,1) = length(t);
    str1(j+1,2) = I1(i1(j+1));
end

i2 = find(diff(double(I2)));
str2=[];
for j=0:(length(i2)-1)
    if j==0
        t = I2(1:i2(j+1));
    else
        t = I2(i2(j)+1:i2(j+1));
    end
    str2(j+1,1) = length(t);
    str2(j+1,2) = I2(i2(j+1));
end
disp('best len=')
disp(length(str0)+length(str1)+length(str2))
disp ('ratio =')
disp(length(s)/(length(str0)+length(str1)+length(str2)))

% i=20 ; [  bitand(s(1:i)',240)/16 bitand(s(1:i)',15) ]
% [ str0(1:10,:) str1(1:10,:) str2(1:10,:) ]


t = zeros(N,1);

t(1:3:end) = 0   +       [1:N/3];
t(2:3:end) = dt1 +       [1:N/3];
t(3:3:end) = dt1 + dt2 + [1:N/3];

figure(1)
subplot(2,1,1)
plot(t,V,t,x,'r',t,x,'.g')
subplot(2,1,2)
plot(t,V-x)

np = fix(sum(fix(s/16))/3)-6;

[fid, MESSAGE] = fopen('out.txt','W');
fprintf(fid,'   dw    %6d\n',np);
fprintf(fid,'   db    %3d\n',s);
fclose(fid);

% ASM replayer simulator

Vp=[];t=1;
ChA=0;ChB=0;ChC=0;

b=1;d=1;e=1;
hl=1;
hlp = np;
while (hlp>0)
	b=b-1;
	if b==0
        b = fix(s(hl)/16);
        bp = bitand(s(hl),15);
        hl=hl+1;
	end
	d=d-1;
	if  d==0
        d = fix(s(hl)/16);
        dp = bitand(s(hl),15);
        hl=hl+1;
	end
	e=e-1;
	if  e==0
        e = fix(s(hl)/16);
        ep = bitand(s(hl),15);
        hl=hl+1;
    end

	ChA = bp;
	Vp(t) = vol(ChA+1)+vol(ChB+1)+vol(ChC+1); t=t+1;
	ChB = dp;
	Vp(t) = vol(ChA+1)+vol(ChB+1)+vol(ChC+1); t=t+1;
	ChC = ep;
	Vp(t) = vol(ChA+1)+vol(ChB+1)+vol(ChC+1); t=t+1;
    hlp = hlp-1;
end

figure(2)
plot(V)
hold on;
plot(Vp,'.r')

ro
msx guru
Berichten: 2245
Geplaatst: 29 December 2005, 13:27   
kewl, I think. 'ave to test it. but sound kewl. thanx Atrag
ARTRAG
msx master
Berichten: 1495
Geplaatst: 29 December 2005, 13:33   
I have sent two small examples at 'downloads@msx.org'
and I am preparing some others short encodedd wavs
ro
msx guru
Berichten: 2245
Geplaatst: 30 December 2005, 08:13   
looking forward 2 it!
ARTRAG
msx master
Berichten: 1495
Geplaatst: 31 December 2005, 16:40   
I have done a good number of tests and I have found that the main problem
of this approach is the size of the data.

My RLE imlementation is very poor as my data (at least some channels) can
be very irregular.
Sometimes the result, wrt an 8bit input, is 1.6 times the input, i.e. it is worst than
not encoding at all, that results 1.5 the original lenght....

I have found also that if I trade off SNR with # transition the size can drastically
decrease to 0.9 times the original sive but the SND tends to reduce of 2-3dB.

I will post soon the new algorithm for trading size with SNR.


snout

msx legend
Berichten: 4981
Geplaatst: 31 December 2005, 21:22   
Sorry for the delay on putting the files online, it have been extremely busy days... as soon as I recover from my new-years hangover, I'll put 'em in our downloads section!
ARTRAG
msx master
Berichten: 1495
Geplaatst: 01 Januari 2006, 05:55   
Take your time.
With jltursan we discovered that the msdos pcmenc.exe could not work on all the PC
due to the need of a .dll from matlab.
Probably (also for increasing the processing speed) it would be required a new version
of pcmenc.exe developed directly in C (I compiled the matlab program).

dvik
msx master
Berichten: 1262
Geplaatst: 01 Januari 2006, 10:21   
Hi ARTRAG,

Is the ideas in your new PSG sample player something totally new or a refinement of what was discussed in http://www.msx.org/forumtopic663p105.html ? It sounds like its somewhat new in the sense you time it quite well and then using an adaptive way of changing samples. I'm really looking forward to hear the result.
ARTRAG
msx master
Berichten: 1495
Geplaatst: 01 Januari 2006, 22:56   
This is a further development more that a refinement.
The approach is completely different from the one we developed (that one was a "sample by sample" encoding,
this one is a "sequence" encoding) but I have planned this solution (the fact that the sequence of PSG levels can
be found with a Viterbi algorithm) when we discovered the "spike" problem.

If you are interested send me an email and I'll send you some stuff to have a
taste of the whole thing.

dvik
msx master
Berichten: 1262
Geplaatst: 01 Januari 2006, 23:06   
Yeah, I remember we talked about having a more dynamic model than just a static table but iirc we didn't get any better results back then. I remember you talked about Viterbi back then too (dont think it was tested though) so I figured your new algorithm is doing something different. If you send a test version I'll try it both in emus and on real MSXes.

My email is: daniel AT vik DOT cc

Do you have any guess on how sensitive it is to timing, i.e. is it good enough if the average frequency is 8kHz or does each individual sample need to be spaced exactly 1/8000 seconds? The reason why I wonder this is if it would be possible to use it in a demo like waves.

Another thing that would be very cool is to write a mod player using this engine. Not sure if its possible but it would be fun to give it a try.

ARTRAG
msx master
Berichten: 1495
Geplaatst: 01 Januari 2006, 23:22   
Package sent!

dvik
msx master
Berichten: 1262
Geplaatst: 01 Januari 2006, 23:27   
Great! Thanks
ARTRAG
msx master
Berichten: 1495
Geplaatst: 01 Januari 2006, 23:31   
The algorithm has a heavy encoding part to be performed with a PC and a light z80 player
that uses RLE and does playback at 8KHz.

Actually unrolling the replayer in 8 branches, you can get on statistic basis "a lot" of spare time without
the need to jitter the sample time.
I was thinking to use the spare time for implementing a better encoding, now the principal problem is the data size.

Each sample needs 3*4 levels, without encoding you get an expansion of 1,5 times (encoding PCM at 8bit).
My RLE implementation is poor, unique values waste a full byte (instead of a nibble), so in average the compression
can be 1.2-1.6 (Yes even worst than without enconding)

This is the reason why i am working to a system for trading quality and higher compression (on the basis of a tunable parameter).

Now the quality ranges from 42dB to 36dB of SNR
while the encoding ratio wrt the original file can vary. If you start from 8bit PCM in the worst
case the RLE expand the result to 1.5 -1.6 times.
(consistent with the fact I encode PSG levels (3x4bit levels) and with the fact that my RLE implementation is very poor.)

With the new solution I am working on, acting on an optimization parameter, I can reduce the SNR increasing the compression.
In this case you can get also a reduction of the size having a final ratio of 0.8 or 0.9 and a
lower SNR of about 35dB.




 
Ga naar pagina ( 1 | 2 | 3 | 4 | 5 Volgende pagina )
 







(c) 1994 - 2008 Stichting MSX Resource Center. MSX is een trademark van MSX Association.