Author
| How is basic stored in the memory?
|
Vampier msx addict Posts: 493 | Posted: 14 October 2007, 11:23 |
Hi, I'm trying to write a Tcl script for openMSX that can read a BASIC listing directly from memory. To do this I need to find out how BASIC listings are stored in memory.
I already got some info from Bifi: programs mostly start at 0x8000 (unless the two bytes at 0xF676 and 0xF677 specify another address).
I also found this list of keywords, each followed by its token and an address in hex (two entries per line):
ABS 06 2E82 DATA 84 485B
AND F6 DEF 97 5010
ASC 15 680B DEFDBL AE 4721
ATN 0E 2A14 DEFINT AC 471B
ATTR$ E9 7C43 DEFSNG AD 471E
AUTO A9 49B5 DEFSTR AB 4718
DELETE A8 53E2
BASE C9 7B5A/7BCB DIM 86 5E9F
BEEP C0 00C0 DRAW BE 5D6E
BIN$ 1D 65FF DSKF 26 7C39
BLOAD CF 6EC6 DSKI$ EA 7C3E
BSAVE D0 6E92 DSKO$ D1 7C16
CALL CA 55A8 ELSE A1 485D
CDBL 20 303A END 81 63EA
CINT 1E 2F8A ERASE A5 6477
CIRCLE BC 5B11 ERL E1 4E0B
CHR$ 16 681B ERR E2 4DFD
CLEAR 92 64AF ERROR A6 49AA
CLOAD 9B 703F EOF 2B 6D25
CLS 9F 00C3 EQV F9
CMD D7 7C34 EXP 0B 2B4A
COLOR BD 7980
CONT 99 6424 FIELD B1 7C52
COPY D6 7C2F FILES B7 6C2F
COS 0C 2993 FIX 21 30BE
CRSLIN E8 790A FN DE 5040
CSAVE 9A 6F87 FOR 82 4524
CSNG 1F 2FB2 FPOS 27 6D39
CVD 2A 7C70 FRE 0F 69F2
CVI 28 7C66
CVS 29 7C6B
-------------------------------------------------------
GET B2 7758 MAX CD 7E4B
GOSUB 8D 47B2 MERGE B6 6B5E
GOTO 89 47E8 MID$ 03 689A
MKD$ 30 7C61
HEX$ 1B 65FA MKI$ 2E 7C57
MKS$ 2F 7C5C
INKEY$ EC 7347 MOD FB
INP 10 4001 MOTOR CE 73B7
INPUT 85 4B6C
INPUT$ 6C87 NAME D3 7C20
INSTR E5 68EB NEW 94 6286
INT 05 30CF NEXT 83 6527
IMP FA NOT E0 4F63
IPL D5 7C2A
IF 8B 49E5 OCT$ 1A 65F5
OFF EB
KEY CC 786C ON 95 48E4
KILL D4 7C25 OPEN B0 6AB7
OR F7
LEN 12 67FE OUT 9C 4016
LEFT$ 01 6861
LET 88 4880 PAD 25 7969
LINE AF 4B0E PAINT BF 59C5
LIST 93 522E PDL 24 795A
LFILES BB 6C2A PEEK 17 541C
LLIST 9E 5229 PLAY C1 73E5/791B
LOC 2C 6D03 POINT ED 5803
LOG 0A 2A72 POKE 98 5423
LOF 2D 6D14 POS 11 4FCC
LOAD B5 6B5D PRESET C3 57E5
LOCATE D8 7766 PRINT 91 4A24
LPOS 1C 4FC7 PSET C2 57EA
LPRINT 9D 4A1D
LSET B8 7C48
-------------------------------------------------------
READ 87 4B9F TAB DB
RENUM AA 5468 TAN 0D 29FB
REM 8F 485D THEN DA
RESTORE 8C 633C TIME CB 7911/7900
RESUME A7 4950 TO D9
RIGHT$ 02 6891 TRON A2 6438
RETURN 8E 4821 TROFF A3 6439
RND 08 2BDF
RSET B9 7C4D
RUN 8A 479E USING E4
USR DD 4FD5
SCREEN C5 79CC
SET D2 7C1B
SGN 04 2E97 VAL 14 68BB
SIN 09 29AC VARPTR E7 4E41
SAVE BA 6BA3 VDP C8 7B37/7B47
SPC DF VPOKE C6 7BE2
SPACE$ 19 6848 VPEEK 18 7EF5
SPRITE C7 7A48/7A84
SOUND C4 73CA
SQR 07 2AFF WAIT 96 401C
STEP DC WIDTH A0 51C9
STICK 22 7940
STOP 90 63E3 XOR F8
STRIG 23 794C
STRING$ E3 6829
STR$ 13 6604
SWAP A4 643E
Then I wrote this
100 S=0
110 FOR I=0 TO 250
120 A$=HEX$(PEEK(&H8000+I))
130 'PRINT A$+"|"+CHR$(VAL("&h"+A$)),S
140 IF A$="EF" THEN PRINT"=";:NEXT
150 IF A$="20" AND S=0 THEN S=1:NEXT
160 IF A$="20" AND S=1 THEN S=0:NEXT
170 IF S=1 THEN PRINT CHR$(VAL("&h"+A$));:NEXT
180 IF A$="80" THEN PRINT CHR$(13)+CHR$(13)+"<line num>"+HEX$(PEEK(&H8000+I+1)*PEEK(&H8000+I+2))+"</line num>";:I=I+2
190 IF A$="E0" THEN PRINT "NOT";
200 IF A$="A8" THEN PRINT "DELETE";
210 IF A$="30" THEN PRINT "MKD$";
220 IF A$="2F" THEN PRINT "MKS$";
230 IF A$="EB" THEN PRINT "OFF";
240 IF A$="F7" THEN PRINT "OR";
250 IF A$="97" THEN PRINT "DEF";
260 IF A$="86" THEN PRINT "DIM";
270 IF A$="6" THEN PRINT "ABS";
280 IF A$="F6" THEN PRINT "AND";
290 IF A$="15" THEN PRINT "ASC";
300 IF A$="0E" THEN PRINT "ATN";
310 IF A$="E9" THEN PRINT "ATTR$";
320 IF A$="A9" THEN PRINT "AUTO";
330 IF A$="C9" THEN PRINT "BASE";
340 IF A$="C0" THEN PRINT "BEEP";
350 IF A$="1D" THEN PRINT "BIN$";
360 IF A$="CF" THEN PRINT "BLOAD";
370 IF A$="D0" THEN PRINT "BSAVE";
380 IF A$="CA" THEN PRINT "CALL";
390 IF A$="20" THEN PRINT "CDBL";
400 IF A$="16" THEN PRINT "CHR$";
410 IF A$="1E" THEN PRINT "CINT";
420 IF A$="BC" THEN PRINT "CIRCLE";
430 IF A$="92" THEN PRINT "CLEAR";
440 IF A$="9B" THEN PRINT "CLOAD";
450 IF A$="9F" THEN PRINT "CLS";
460 IF A$="D7" THEN PRINT "CMD";
470 IF A$="BD" THEN PRINT "COLOR";
480 IF A$="99" THEN PRINT "CONT";
490 IF A$="D6" THEN PRINT "COPY";
500 IF A$="0C" THEN PRINT "COS";
510 IF A$="E8" THEN PRINT "CRSLIN";
520 IF A$="9A" THEN PRINT "CSAVE";
530 IF A$="1F" THEN PRINT "CSNG";
540 IF A$="2A" THEN PRINT "CVD";
550 IF A$="28" THEN PRINT "CVI";
560 IF A$="29" THEN PRINT "CVS";
570 IF A$="84" THEN PRINT "DATA";
580 IF A$="AE" THEN PRINT "DEFDBL";
590 IF A$="AC" THEN PRINT "DEFINT";
600 IF A$="AD" THEN PRINT "DEFSNG";
610 IF A$="AB" THEN PRINT "DEFSTR";
620 IF A$="BE" THEN PRINT "DRAW";
630 IF A$="26" THEN PRINT "DSKF";
640 IF A$="EA" THEN PRINT "DSKI$";
650 IF A$="D1" THEN PRINT "DSKO$";
660 IF A$="A1" THEN PRINT "ELSE";
670 IF A$="81" THEN PRINT "END";
680 IF A$="2B" THEN PRINT "EOF";
690 IF A$="F9" THEN PRINT "EQV";
700 IF A$="A5" THEN PRINT "ERASE";
710 IF A$="E1" THEN PRINT "ERL";
720 IF A$="E2" THEN PRINT "ERR";
730 IF A$="A6" THEN PRINT "ERROR";
740 IF A$="0B" THEN PRINT "EXP";
750 IF A$="B1" THEN PRINT "FIELD";
760 IF A$="B7" THEN PRINT "FILES";
770 IF A$="21" THEN PRINT "FIX";
780 IF A$="DE" THEN PRINT "FN";
790 IF A$="82" THEN PRINT "FOR";
800 IF A$="27" THEN PRINT "FPOS";
810 IF A$="0F" THEN PRINT "FRE";
820 IF A$="B2" THEN PRINT "GET";
830 IF A$="8D" THEN PRINT "GOSUB";
840 IF A$="89" THEN PRINT "GOTO";
850 IF A$="1B" THEN PRINT "HEX$";
860 IF A$="8B" THEN PRINT "IF";
870 IF A$="FA" THEN PRINT "IMP";
880 IF A$="EC" THEN PRINT "INKEY$";
890 IF A$="10" THEN PRINT "INP";
900 IF A$="85" THEN PRINT "INPUT";
910 'IF A$="6C87" THEN PRINT "INPUT$"; -- not valid
920 IF A$="E5" THEN PRINT "INSTR";
930 IF A$="5" THEN PRINT "INT";
940 IF A$="D5" THEN PRINT "IPL";
950 IF A$="CC" THEN PRINT "KEY";
960 IF A$="D4" THEN PRINT "KILL";
970 IF A$="2D" THEN PRINT "LOF";
980 IF A$="1" THEN PRINT "LEFT$";
990 IF A$="12" THEN PRINT "LEN";
1000 IF A$="88" THEN PRINT "LET";
1010 IF A$="BB" THEN PRINT "LFILES";
1020 IF A$="AF" THEN PRINT "LINE";
1030 IF A$="93" THEN PRINT "LIST";
1040 IF A$="9E" THEN PRINT "LLIST";
1050 IF A$="B5" THEN PRINT "LOAD";
1060 IF A$="2C" THEN PRINT "LOC";
1070 IF A$="D8" THEN PRINT "LOCATE";
1080 IF A$="0A" THEN PRINT "LOG";
1090 IF A$="1C" THEN PRINT "LPOS";
1100 IF A$="9D" THEN PRINT "LPRINT";
1110 IF A$="B8" THEN PRINT "LSET";
1120 IF A$="CD" THEN PRINT "MAX";
1130 IF A$="B6" THEN PRINT "MERGE";
1140 IF A$="3" THEN PRINT "MID$";
1150 IF A$="2E" THEN PRINT "MKI$";
1160 IF A$="FB" THEN PRINT "MOD";
1170 IF A$="CE" THEN PRINT "MOTOR";
1180 IF A$="D3" THEN PRINT "NAME";
1190 IF A$="94" THEN PRINT "NEW";
1200 IF A$="83" THEN PRINT "NEXT";
1210 IF A$="1A" THEN PRINT "OCT$";
1220 IF A$="95" THEN PRINT "ON";
1230 IF A$="B0" THEN PRINT "OPEN";
1240 IF A$="9C" THEN PRINT "OUT";
1250 IF A$="25" THEN PRINT "PAD";
1260 IF A$="BF" THEN PRINT "PAINT";
1270 IF A$="24" THEN PRINT "PDL";
1280 IF A$="17" THEN PRINT "PEEK";
1290 IF A$="C1" THEN PRINT "PLAY";
1300 IF A$="ED" THEN PRINT "POINT";
1310 IF A$="98" THEN PRINT "POKE";
1320 IF A$="11" THEN PRINT "POS";
1330 IF A$="C3" THEN PRINT "PRESET";
1340 IF A$="91" THEN PRINT "PRINT";
1350 IF A$="C2" THEN PRINT "PSET";
1360 IF A$="87" THEN PRINT "READ";
1370 IF A$="8F" THEN PRINT "REM";
1380 IF A$="AA" THEN PRINT "RENUM";
1390 IF A$="8C" THEN PRINT "RESTORE";
1400 IF A$="A7" THEN PRINT "RESUME";
1410 IF A$="8E" THEN PRINT "RETURN";
1420 IF A$="2" THEN PRINT "RIGHT$";
1430 IF A$="8" THEN PRINT "RND";
1440 IF A$="B9" THEN PRINT "RSET";
1450 IF A$="8A" THEN PRINT "RUN";
1460 IF A$="BA" THEN PRINT "SAVE";
1470 IF A$="C5" THEN PRINT "SCREEN";
1480 IF A$="D2" THEN PRINT "SET";
1490 IF A$="4" THEN PRINT "SGN";
1500 IF A$="9" THEN PRINT "SIN";
1510 IF A$="C4" THEN PRINT "SOUND";
1520 IF A$="19" THEN PRINT "SPACE$";
1530 IF A$="DF" THEN PRINT "SPC";
1540 IF A$="C7" THEN PRINT "SPRITE";
1550 IF A$="7" THEN PRINT "SQR";
1560 IF A$="DC" THEN PRINT "STEP";
1570 IF A$="22" THEN PRINT "STICK";
1580 IF A$="90" THEN PRINT "STOP";
1590 IF A$="13" THEN PRINT "STR$";
1600 IF A$="23" THEN PRINT "STRIG";
1610 IF A$="E3" THEN PRINT "STRING$";
1620 IF A$="A4" THEN PRINT "SWAP";
1630 IF A$="DB" THEN PRINT "TAB";
1640 IF A$="0D" THEN PRINT "TAN";
1650 IF A$="DA" THEN PRINT "THEN";
1660 IF A$="CB" THEN PRINT "TIME";
1670 IF A$="D9" THEN PRINT "TO";
1680 IF A$="A3" THEN PRINT "TROFF";
1690 IF A$="A2" THEN PRINT "TRON";
1700 IF A$="E4" THEN PRINT "USING";
1710 IF A$="DD" THEN PRINT "USR";
1720 IF A$="14" THEN PRINT "VAL";
1730 IF A$="E7" THEN PRINT "VARPTR";
1740 IF A$="C8" THEN PRINT "VDP";
1750 IF A$="18" THEN PRINT "VPEEK";
1760 IF A$="C6" THEN PRINT "VPOKE";
1770 IF A$="96" THEN PRINT "WAIT";
1780 IF A$="A0" THEN PRINT "WIDTH";
1790 IF A$="F8" THEN PRINT "XOR";
1800 NEXT
For some reason it almost works, but I seem to be missing vital parts of information. Can someone help me?
|
|
Sonic_aka_T msx guru Posts: 2262 | Posted: 14 October 2007, 11:33 |
Clueless as to how all this works, but maybe you should have a look at the SAVE "",A routine from basic itself?
|
|
AuroraMSX msx master Posts: 1231 | Posted: 14 October 2007, 11:38 |
Cool idea
I'm missing some code that evaluates constants (strings, numbers, etc.). I guess that's what's messing up your output...
And do use the system variable at &HF676 that tells you where BASIC begins: much more reliable than just starting at &H8000!
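A minimal sketch in Python of what reading that pointer involves, assuming the script has some way to peek a single byte from emulator memory. The `peek` callable and `fake_memory` below are stand-ins for illustration, not openMSX API. TXTTAB is the usual name of the variable at &HF676, and its value is stored low byte first. Line numbers are stored the same way, so the two bytes after the link are combined as low + 256 * high rather than multiplied.

```python
TXTTAB = 0xF676  # system variable holding the start address of the BASIC program


def read_word(peek, addr):
    """Combine two consecutive bytes, low byte first, into a 16-bit value."""
    return peek(addr) + 256 * peek(addr + 1)


# Stand-in for emulator memory (hypothetical contents): TXTTAB pointing at &H8001.
fake_memory = {0xF676: 0x01, 0xF677: 0x80}
start = read_word(fake_memory.__getitem__, TXTTAB)
print(hex(start))
```

With a real emulator the dictionary lookup would be replaced by whatever memory-read call the scripting environment offers.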
|
jltursan msx professional Posts: 847 | Posted: 14 October 2007, 12:08 |
The description of the BASIC interpreter's tokeniser routine might help you (quoted from the MSX Red Book):
Quote:
|
This routine is used by the Interpreter Mainloop to tokenize
a line of text. On entry register pair HL points to the first
text character in BUF. On exit the tokenized line is in KBUF,
register pair BC holds its length and register pair HL points
to its start.
Except after opening quotes or after the "REM", "CALL" or
"DATA" keywords any string of characters matching a keyword is
replaced by that keyword's token. Lower case alphabetics are
changed to upper case for keyword comparison. The character "?"
is replaced by the "PRINT" token (91H) and the character "'" by
":" (3AH), "REM" token (8FH), "'" token (E6H). The "ELSE" token
(A1H) is preceded by a statement separator (3AH). Any other
miscellaneous characters in the text are copied without
alteration except that lower case alphabetics are converted to
upper case. Those tokens smaller than 80H, the function tokens,
cannot be stored directly in KBUF as they will conflict with
ordinary text. Instead the sequence FFH, token+80H is used.
Numeric constants are first converted into one of the
standard types in DAC (3299H). They are then stored in one of
several ways depending upon their type and magnitude, the
general idea being to minimize memory usage:
0BH LSB MSB ................... Octal number
0CH LSB MSB ................... Hex number
11H to 1AH .................... Integer 0 to 9
0FH LSB ....................... Integer 10 to 255
1CH LSB MSB ................... Integer 256 to 32767
1DH EE DD DD DD ............... Single Precision
1FH EE DD DD DD DD DD DD DD ... Double Precision
There is no specific token for binary numbers, these are left
as character strings. This would appear to be a legacy from
earlier versions of Microsoft BASIC. Any sign prefixing a
number is regarded as an operator and is stored as a separate
token, negative numbers are not produced during tokenization.
As double precision numbers occupy so much space a line
containing too many, for example PRINT 1#,1#,1# etc. may cause
KBUF to fill up. If this happens a "Line buffer overflow" error
is generated.
Any number following one of the keyword tokens in the table
at 43B5H is considered to be a line number operand and is
stored with a different token:
0DH LSB MSB ................... Pointer
0EH LSB MSB ................... Line number
During tokenization only the normal type (0EH) is generated,
when a program actually runs these line number operands are
converted to the address pointer type (0DH).
|
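The numeric-constant prefixes in the quote can be turned into a small decoder. This is only a sketch in Python under the byte layouts listed above. The function name `decode_constant` and the placeholder text it emits for pointers and floating-point values are my own choices, and single and double precision values are skipped rather than converted.

```python
def decode_constant(data, pos):
    """Decode one numeric-constant token starting at data[pos].

    Returns (text, next_pos), or None if data[pos] is not a constant prefix.
    """
    t = data[pos]
    if 0x11 <= t <= 0x1A:                      # integer 0 to 9, no operand bytes
        return str(t - 0x11), pos + 1
    if t == 0x0F:                              # integer 10 to 255, one operand byte
        return str(data[pos + 1]), pos + 2
    if t in (0x0B, 0x0C, 0x0D, 0x0E, 0x1C):    # 16-bit operand, LSB first
        value = data[pos + 1] | (data[pos + 2] << 8)
        if t == 0x0B:
            text = "&O%o" % value              # octal number
        elif t == 0x0C:
            text = "&H%X" % value              # hex number
        elif t == 0x0D:
            text = "<pointer &H%04X>" % value  # address form, only after running
        else:
            text = str(value)                  # line number (0EH) or integer (1CH)
        return text, pos + 3
    if t in (0x1D, 0x1F):                      # single / double precision
        size = 4 if t == 0x1D else 8
        raw = data[pos + 1:pos + 1 + size]
        return "<float " + raw.hex() + ">", pos + 1 + size  # not converted here
    return None


print(decode_constant(bytes([0x1C, 0xFF, 0x1A]), 0))
```

Returning the next position lets a caller step over the operand bytes, which matters because those bytes can contain 00 or values that look like keyword tokens.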
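The operators, if I'm not wrong, have these values in the interpreter's precedence table (these are precedences, not the tokens stored in a program; the token table at the top of the thread lists AND as F6, OR as F7 and so on):
79H ... +      46H ... OR
79H ... -      3CH ... XOR
7CH ... *      32H ... EQV
7CH ... /      28H ... IMP
7FH ... ^      7AH ... MOD
50H ... AND    7BH ... \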
And this is an example of a little BASIC program detokenised:
10 KEYOFF:SCREEN 1:WIDTH 32:
20 FOR I=&H1800 TO 6911
30 A=INT(RND(-TIME)*256)
40 VPOKE I,A
50 NEXT I
60 GOTO 60
Memory contents at &H8000:
(00)
(12 80)(0a 00)(cc)(eb)(3a)(c5)(20)(12)(3a)(a0)(20)(0f 20)(3a)(00)
8012 10 KEY OFF : SCREEN 1 : WIDTH 32 :
(24 80)(14 00)(82)(20)(49)(ef)(0c 00 18)(20)(d9)(20)(1c ff 1a)(00)
8024 20 FOR I = &H 1800 TO 6911
(39 80)(1e 00)(41)(ef)(ff 85)(28)(ff 88)(28)(f2)(cb)(29)(f3)(1c 00 01)(29)(00)
8039 30 A = INT ( RND ( - TIME) * 256 )
(43 80)(28 00)(c6)(20)(49)(2c)(41)(00)
8043 40 VPOKE I , A
(4b 80)(32 00)(83)(20)(49)(00)
804B 50 NEXT I
(55 80)(3c 00)(89)(20)(0d 4a 80)(00)
8055 60 GOTO POINTER:804A
(00 00)
END OF LISTING
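To make the layout concrete, here is a rough Python sketch that walks the dump above line by line: each line is a 2-byte link to the next line, a 2-byte line number, the tokenized body and a 00 terminator, and a link of 0000 ends the program. The names `IMAGE`, `word` and `walk_lines` are made up for this example. I have assumed a terminating 00 after line 60, since the link value 8055 only fits with it, and the start address 8001 is hard-coded where a real script would take it from &HF676.

```python
BASE = 0x8000  # address of the first byte in IMAGE

# Bytes of the example program, copied from the dump (one row per BASIC line).
IMAGE = bytes.fromhex(
    "00"
    "12 80 0a 00 cc eb 3a c5 20 12 3a a0 20 0f 20 3a 00"
    "24 80 14 00 82 20 49 ef 0c 00 18 20 d9 20 1c ff 1a 00"
    "39 80 1e 00 41 ef ff 85 28 ff 88 28 f2 cb 29 f3 1c 00 01 29 00"
    "43 80 28 00 c6 20 49 2c 41 00"
    "4b 80 32 00 83 20 49 00"
    "55 80 3c 00 89 20 0d 4a 80 00"  # trailing 00 assumed, see note above
    "00 00"
)


def word(mem, addr):
    """Read a 16-bit value, low byte first, at an absolute address."""
    off = addr - BASE
    return mem[off] | (mem[off + 1] << 8)


def walk_lines(mem, start):
    """Yield (line_number, body_bytes) for every stored line."""
    addr = start
    while True:
        link = word(mem, addr)
        if link == 0:          # 00 00 marks the end of the program
            break
        number = word(mem, addr + 2)
        # The body runs from after the line number up to the 00 before the next line.
        body = mem[addr + 4 - BASE:link - 1 - BASE]
        yield number, body
        addr = link


lines = list(walk_lines(IMAGE, BASE + 1))
for number, body in lines:
    print(number, body.hex(" "))
```

Following the link instead of scanning for the next 00 avoids stopping early on a 00 that is part of a constant, such as the (0c 00 18) in line 20.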
|
|
PingPong msx professional Posts: 882 | Posted: 14 October 2007, 16:10 |
How basic is stored in memory? In a very inefficient way (slooooooooooooooooooooooooooooooooooooow)
MSX BASIC is one of the slowest BASICs for 8-bit machines (thx, m$)
|
|
Sonic_aka_T msx guru Posts: 2262 | Posted: 14 October 2007, 17:19 |
Quote:
| How basic is stored in memory? In a very inefficient way (slooooooooooooooooooooooooooooooooooooow)
MSX BASIC is one of the slowest BASICs for 8-bit machines (thx, m$)
|
It's also the most powerful BASIC... (as in complete) |
|
AuroraMSX msx master Posts: 1231 | Posted: 14 October 2007, 18:38 |
Quote:
| How basic is stored in memory? In a very inefficient way (slooooooooooooooooooooooooooooooooooooow)
|
Actually, the tokenized version is not that inefficient and very similar to how other BASICs store their programs in RAM.
Quote:
| It's also the most powerful BASIC... (as in complete)
|
No, it's not. Have a look at BBC BASIC: that's like MSX BASIC plus a couple of nice features such as procedures and loop constructs like WHILE/DO and DO/UNTIL...
|
Vampier msx addict Posts: 493 | Posted: 14 October 2007, 18:53 |
Thanks for the long reply there
TCL-ing as we speak  |
|
dvik msx master Posts: 1303 | Posted: 14 October 2007, 20:06 |
A much easier way of getting the basic listing out of an emulator is to do:
LLIST
which sends the listing to the printer port. Then in blueMSX or openMSX you save the printer output to a file.
|
|
Sonic_aka_T msx guru Posts: 2262 | Posted: 14 October 2007, 20:19 |
I always use SAVE "",A with dirasdisk...
|
|
cax msx professional Posts: 1011 | Posted: 14 October 2007, 20:28 |
AFAIK there already exist some tools that untokenize basic programs, and they even made their appearance on the main news page in the past...
|
|
Vampier msx addict Posts: 493 | Posted: 14 October 2007, 20:39 |
Come on.. think outside the box... what's the whole purpose of me doing this?  |
|
Metalion msx freak Posts: 215 | Posted: 14 October 2007, 20:51 |
Quote:
| Come on.. think outside the box... what's the whole purpose of me doing this? 
|
I must say I've got absolutely no idea    |
|
Vampier msx addict Posts: 493 | Posted: 14 October 2007, 20:52 |
Reading from and writing to the emulator directly?
|
|
multi msx lover Posts: 67 | Posted: 15 October 2007, 04:08 |
So you can make a very nice, full-featured MSX BASIC editor on the PC that fires up the emulator and starts executing when you press the run button?
|
|