-
98282675b0
Fixed embedding mistake
main
k
2026-03-05 15:13:34 -05:00
-
8d8fb8c212
experimental "lin" blocks instead of attention, used sparsely
k
2026-02-27 09:13:24 -05:00
-
89c9d01cb8
More training with fewer heads
k
2026-02-27 09:13:01 -05:00
-
dc231ae703
fixed bos token being prepended twice
k
2026-02-27 09:10:34 -05:00
-
a0cd98876c
added gitignore
k
2026-01-13 21:11:32 -05:00
-
0537a5df64
changed chat dataset.
k
2026-01-09 17:30:34 -05:00
-
c78a31362a
set to GPT-2 hyperparameters
k
2026-01-09 12:45:01 -05:00
-
496916f428
added fine-tuning
k
2026-01-07 13:01:06 -05:00
-
121640bab6
updated hyperparameters for my GPU
k
2026-01-07 12:59:44 -05:00
-
6f037c4a9a
Quick training script
k
2026-01-07 02:14:09 -05:00
-
7f25dff1d1
Fix errors
k
2026-01-07 02:13:08 -05:00
-
007c96e91b
Simple log functions
k
2026-01-07 01:25:47 -05:00
-
6daa8ec46c
Added code to generate training batches
k
2026-01-07 01:15:18 -05:00
-
229c564811
CosineAnnealing with optimizer Group
k
2026-01-07 00:26:04 -05:00
-
478010c8cc
added positional encodings
k
2026-01-06 21:38:12 -05:00
-
3b590b3ce7
added dropout to ffn
k
2026-01-06 21:26:51 -05:00
-
957aad2239
Implemented Transformer (decoder-only)
k
2026-01-06 21:26:24 -05:00
-
23f62c7e64
Implemented TransformerBlock
k
2026-01-06 19:53:37 -05:00
-
77aa0de0eb
Implemented MultiHeadAttention
k
2026-01-06 19:41:12 -05:00
-
c4e5e332ba
fixed cast in ffn
k
2026-01-06 19:40:45 -05:00
-
d6b9f45fcc
Implemented Feed Forward Network
k
2026-01-06 18:31:04 -05:00