~alcinnz/rhapsode

0f61fa288160dc0c605e96418648277f2597a05f — Adrian Cochrane 4 years ago 2481b8c
Manually linewrap the doc files.
M docs/CSS-Speech-Tutorial.md => docs/CSS-Speech-Tutorial.md +32 -13
@@ 1,30 1,48 @@
**Wanted:** Guidance on creating a great audio theme. Maybe based on voice acting or public speaking theory.
**Wanted:** Guidance on creating a great audio theme. Maybe based on voice acting
or public speaking theory.

Rhapsode still lets you apply CSS styles to your webpages, but since it outputs audio rather than video it supports a different set of CSS properties. This page provides an overview of these properties.
Rhapsode still lets you apply CSS styles to your webpages, but since it outputs
audio rather than video it supports a different set of CSS properties. This page
provides an overview of these properties.

## Should it be spoken?
You can use the `speak` property to determine whether an HTML element should be read aloud or not, and the `speak-as` property to determine how it reads digits and/or punctuation.
You can use the `speak` property to determine whether an HTML element should be
read aloud or not, and the `speak-as` property to determine how it reads digits
and/or punctuation.

Setting `speak: never` is not the same as setting `voice-volume: silent`, as the latter still takes up the same amount of time as reading the text aloud would.
Setting `speak: never` is not the same as setting `voice-volume: silent`, as
the latter still takes up the same amount of time as reading the text aloud
would.
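
As a minimal sketch of the difference, assuming Rhapsode accepts the value
syntax from the CSS Speech draft (the class names here are hypothetical):

```css
/* Skipped entirely: produces no audio and takes up no time. */
.visual-only { speak: never; }

/* Still "read", but silently: leaves a gap as long as the spoken text. */
.redacted { voice-volume: silent; }

/* Read digits one at a time and name the punctuation marks aloud. */
.phone-number { speak-as: digits literal-punctuation; }
```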

## The Voice
You can use the `voice-family` property to select a voice either by age/gender/variant or by its name. Just like `font-family` in a visual browser, this'll make a big difference to the feel of your page.
You can use the `voice-family` property to select a voice either by
age/gender/variant or by its name. Just like `font-family` in a visual browser,
this'll make a big difference to the feel of your page.

## Speaking Style
You can alter the voice you choose by varying its volume, rate, pitch, range, and stress. Doing so helps people pay attention, especially if it reinforces the meaning of your text.
You can alter the voice you choose by varying its volume, rate, pitch, range,
and stress. Doing so helps people pay attention, especially if it reinforces the
meaning of your text.

### Keywords & Offsets
All the speaking style properties provide keywords you can use instead of a number. You can also write a number after a keyword to represent an offset from that keyword.
All the speaking style properties provide keywords you can use instead of a
number. You can also write a number after a keyword to represent an offset from
that keyword.
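
For illustration, here is a sketch of a keyword on its own and a keyword
followed by an offset. It assumes Rhapsode follows the units used in the CSS
Speech draft (`dB`, `st` for semitones, and percentages); the selectors and
values are only examples.

```css
/* Keywords on their own. */
em { voice-pitch: high; voice-rate: slow; }

/* A keyword followed by an offset from that keyword. */
strong { voice-volume: loud +3dB; voice-stress: strong; }
aside  { voice-pitch: low -2st; voice-rate: fast 110%; }
```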

## The CSS Speech "Box Model"
On either end of your text you can place an audio cue to identify it, and on either end of those you can insert additional silence. If two silences are directly adjacent, the smaller one will be removed.
On either end of your text you can place an audio cue to identify it, and on
either end of those you can insert additional silence. If two silences are
directly adjacent, the smaller one will be removed.

The inner pauses are called the element's `rest` and the outer ones are called its `pause`.
The inner pauses are called the element's `rest` and the outer ones are called
its `pause`.

The user agent stylesheet, for example, uses audio cues to indicate list bullets and links. And silence functions exactly like whitespace in a visual browser.
The user agent stylesheet, for example, uses audio cues to indicate list bullets
and links. And silence functions exactly like whitespace in a visual browser.
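
The layering described above could be written out as follows. This is a sketch
only: the property names come from the CSS Speech draft, while `bullet.wav` and
the durations are hypothetical.

```css
li {
  pause-before: 200ms;          /* outer silence */
  cue-before: url(bullet.wav);  /* audio cue identifying the item */
  rest-before: 100ms;           /* inner silence */
  /* ...the list item's text is spoken here... */
  rest-after: 100ms;
  pause-after: 200ms;
}
```

With this rule, the `pause-after` of one list item sits directly against the
`pause-before` of the next, so only one of the two silences is kept.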

## Text Generation
Rhapsode supports (some of) the same text generation attributes as visual browsers, namely:
Rhapsode supports (some of) the same text generation attributes as visual
browsers, namely:

* `counter-reset`
* `counter-increment`


@@ 33,9 51,10 @@ Rhapsode supports (some of) the same text generation attributes as visual browse

Though more may be added in the future.

However, unlike visual browsers, you can apply the `content` property to the element itself to replace its own children.
However, unlike visual browsers, you can apply the `content` property to the
element itself to replace its own children.
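
A short sketch of both uses follows. It assumes Rhapsode supports the
`::before` pseudo-element and the `counter()` and `attr()` functions the same
way visual browsers do, which this page does not confirm.

```css
/* Number list items aloud using a counter. */
ol { counter-reset: item; }
li { counter-increment: item; }
li::before { content: counter(item) ". "; }

/* Replace an abbreviation's own text with its expansion. */
abbr[title] { content: attr(title); }
```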

---

* [CSS3 Speech Module](https://drafts.csswg.org/css-speech-1/) (Retired W3C Note)
* [MDN on CSS Counters](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Lists_and_Counters/Using_CSS_counters)
\ No newline at end of file
* [MDN on CSS Counters](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Lists_and_Counters/Using_CSS_counters)

M docs/Designing-for-Rhapsode.md => docs/Designing-for-Rhapsode.md +37 -13
@@ 1,41 1,65 @@
## 1. Write Semantic (X)HTML5
Use *changes* in Rhapsode's voice to *enhance* the communication of your text; that'll help people pay attention to it. To do so, start by making meaningful use of the (X)HTML5 tags, and then (if you want) you can get more specific with Rhapsode-specific CSS. At the very least provide good links.
Use *changes* in Rhapsode's voice to *enhance* the communication of your text;
that'll help people pay attention to it. To do so, start by making meaningful
use of the (X)HTML5 tags, and then (if you want) you can get more specific with
Rhapsode-specific CSS. At the very least provide good links.

If you've got quotes, marking them with `<q>` or `<blockquote>` tags rather than quote marks will render much more clearly in Rhapsode. Rhapsode will then render these in a new voice, while visual browsers will still render them as the appropriate quote marks for your language.
If you've got quotes, marking them with `<q>` or `<blockquote>` tags rather than
quote marks will render much more clearly in Rhapsode. Rhapsode will then render
these in a new voice, while visual browsers will still render them as the
appropriate quote marks for your language.

## 2. Declare Your Page's Language
If Rhapsode knows what language your page is written in, it can alter some of the phrases it inserts to match (not that it does yet). To do so use the `lang` or `xml:lang` attributes on the root `<html>` element.
If Rhapsode knows what language your page is written in, it can alter some of
the phrases it inserts to match (not that it does yet). To do so use the `lang`
or `xml:lang` attributes on the root `<html>` element.

This mostly just applies if you've got forms on the page.

## 3. Don't Rely on JavaScript
Rhapsode doesn't support JavaScript, as its APIs don't map cleanly to Rhapsode's experience, and because Rhapsode doesn't like its complexity or security model.
Rhapsode doesn't support JavaScript, as its APIs don't map cleanly to Rhapsode's
experience, and because Rhapsode doesn't like its complexity or security model.

As such if your pages break when JavaScript's disabled, they'll almost certainly break in Rhapsode. The exception is for simple scripts that alter the style of an existing element, because Rhapsode follows an alternative set of CSS properties.
As such if your pages break when JavaScript's disabled, they'll almost certainly
break in Rhapsode. The exception is for simple scripts that alter the style of an
existing element, because Rhapsode follows an alternative set of CSS properties.

## 4. Avoid Excess Text
Listeners may tune out if you don't get straight to the point and stay on topic. As for styles, it doesn't matter much what you do as long as you're consistent.
Listeners may tune out if you don't get straight to the point and stay on topic.
As for styles, it doesn't matter much what you do as long as you're consistent.

## Navigation
There are a couple of additional points when it comes to navigation.

**NOTE:** Navigation has not yet been implemented.

### 5. Never Override `:link {cue-before}`
Visitors will be relying on this audio cue to know this is a link they can follow, and overriding it defeats the purpose of the link.
### 5. Never Override `:link {cue-after}`
Visitors will be relying on this audio cue to know this is a link they can follow,
and overriding it defeats the purpose of the link.

In fact, you are not permitted to do so. This style rule is `!important`.

### 6. Don't Rely on Navbar (Or Adjust Styles)
Because excess text can bore the visitor, Rhapsode defaults to not read out your `<nav>` tag. However it'll still allow visitors to follow these links if they can intuit that they exist, so it's still very useful to provide a navbar on your pages.
Because excess text can bore the visitor, Rhapsode defaults to not read out your
`<nav>` tag. However it'll still allow visitors to follow these links if they
can intuit that they exist, so it's still very useful to provide a navbar on
your pages.

As such you should make sure that visitors can navigate your entire site without using the navbar, with the navbar itself acting as an enhancement but not a necessity.
As such you should make sure that visitors can navigate your entire site without
using the navbar, with the navbar itself acting as an enhancement but not a
necessity.

Or alternatively you can override this default via the CSS `nav {speak: always}`. I would suggest applying this style *only* to your homepage, so it doesn't get in the way of your site's text.
Or alternatively you can override this default via the CSS `nav {speak: always}`.
I would suggest applying this style *only* to your homepage, so it doesn't get
in the way of your site's text.

## 7. Reset All Properties In Your Voice Stylesheets
Rhapsode reserves the right to adjust its user agent stylesheet to better suit the majority of sites not targeting Rhapsode specifically. As such you should not rely on these defaults staying as they are when styling your own pages for it.
Rhapsode reserves the right to adjust its user agent stylesheet to better suit
the majority of sites not targeting Rhapsode specifically. As such you should not
rely on these defaults staying as they are when styling your own pages for it.

The good news is that there's not that many properties to reset to `initial`.

---

[elementary OS's blog](https://blog.elementary.io/) sounds great in Rhapsode, for example.
\ No newline at end of file
[elementary OS's blog](https://blog.elementary.io/) sounds great in Rhapsode, for example.

M docs/Hypothetical/Custom-CPU-Design.md => docs/Hypothetical/Custom-CPU-Design.md +58 -20
@@ 1,4 1,5 @@
This page describes some hypothetical hardware designed specifically to run a Rhapsode-like web browser. There are no plans to build *this* hardware.
This page describes some hypothetical hardware designed specifically to run a
Rhapsode-like web browser. There are no plans to build *this* hardware.

However this hypothetical may help to clarify how Rhapsode works.



@@ 20,32 21,55 @@ A navigation task is performed by:
13. Convert the SSML/phonemes into audio manipulations
14. Output raw audio

Almost all of those steps are straightforward format conversions (possibly via instructions extracted from HTML) or map lookups. So that's what I'll design here.
Almost all of those steps are straightforward format conversions (possibly via
instructions extracted from HTML) or map lookups. So that's what I'll design here.

The main exceptions are TLS, TCP, & especially voice recognition. TLS requires circuitry that can perform en/de-cryption, whilst TCP requires cancellable timeouts, randomness, and coroutines. Voice recognition will be addressed later.
The main exceptions are TLS, TCP, & especially voice recognition. TLS requires
circuitry that can perform en/de-cryption, whilst TCP requires cancellable
timeouts, randomness, and coroutines. Voice recognition will be addressed later.

## Parsing
Let's say network, buttons, and (voice-recognized) audio are written into a ringbuffer by those input devices. Each 4 bits(?) of that input would navigate a graph describing the syntax being matched.
Let's say network, buttons, and (voice-recognized) audio are written into a
ringbuffer by those input devices. Each 4 bits(?) of that input would navigate
a graph describing the syntax being matched.

The nodes in that graph could "call" other syntaxes or tries (for more complex syntaxes) before "returning" to where it left off by pushing and popping a stack.
The nodes in that graph could "call" other syntaxes or tries (for more complex
syntaxes) before "returning" to where it left off by pushing and popping a stack.

If the parsing CPU encounters a node that's not in its cache memory (a "cache miss"), I'd have it immediately load it in from memory. And since this CPU focuses on format conversions anyways, it could be repurposed to decompress/decode the new instructions.
If the parsing CPU encounters a node that's not in its cache memory (a
"cache miss"), I'd have it immediately load it in from memory. And since
this CPU focuses on format conversions anyways, it could be repurposed to
decompress/decode the new instructions.

## Reformatting
Each of the parsing rules you can call could optionally have corresponding instructions for what to do upon pop, so that those instructions could be prefetched upon push and enqueued upon pop. There may also be an "echo" shorthand in this process.
Each of the parsing rules you can call could optionally have corresponding
instructions for what to do upon pop, so that those instructions could be
prefetched upon push and enqueued upon pop. There may also be an "echo"
shorthand in this process.

Those instructions in turn would output bytes to external hardware, cached disk pages, and/or other programs as tracked in a "capabilities stack". Bytes written to other programs would be queued up in an "idle" ringbuffer to be dequeued when there's no external input.
Those instructions in turn would output bytes to external hardware, cached disk
pages, and/or other programs as tracked in a "capabilities stack". Bytes written
to other programs would be queued up in an "idle" ringbuffer to be dequeued when
there's no external input.

To compile machine code, update caches, add a timeout, or sort/dedup output there'd also need to be an instruction to rewrite specified page(s) of memory. This could be handled using the same circuits as cache misses during parsing, or it could trigger the interrupt only once the fetch has completed.
To compile machine code, update caches, add a timeout, or sort/dedup output
there'd also need to be an instruction to rewrite specified page(s) of memory.
This could be handled using the same circuits as cache misses during parsing,
or it could trigger the interrupt only once the fetch has completed.

---

Occasionally an ALU would be required for encryption, comparison, checksums, sound effects, etc.
Occasionally an ALU would be required for encryption, comparison, checksums,
sound effects, etc.

Coroutines would be required for TLS and (navigable) audio output. Saves could be done by writing a pointer to its stack(s) to another page. And restores could occur via parsing cache miss once it's been looked up.
Coroutines would be required for TLS and (navigable) audio output. Saves could
be done by writing a pointer to its stack(s) to another page. And restores could
occur via parsing cache miss once it's been looked up.

## Memory Blocks
This hypothetical CPU would require very little circuitry, relying almost entirely on multiple independent chunks of memory that can be accessed concurrently. It should be trivial to build on an FPGA.
This hypothetical CPU would require very little circuitry, relying almost entirely
on multiple independent chunks of memory that can be accessed concurrently. It
should be trivial to build on an FPGA.

Specifically it'd include memory blocks for:



@@ 58,14 82,20 @@ Specially it'd include memory blocks for:
7. Staging areas/stacks
8. Idle queue

There may be a second core that turns on when the idle queue overflows, which would have (some of) its own dedicated memory blocks. Also a bitmask could be used to allocate overflow and other pages.
There may be a second core that turns on when the idle queue overflows, which
would have (some of) its own dedicated memory blocks. Also a bitmask could be
used to allocate overflow and other pages.

Furthermore the queues and stacks could have near-perfect cache hit rates, and would rarely overflow to memory.
Furthermore the queues and stacks could have near-perfect cache hit rates, and
would rarely overflow to memory.

## Voice Recognition
There are two approaches to voice recognition I'm familiar with: Mozilla Deep Voice & CMU Sphinx. What's described here caters to both approaches, whilst the circuit described above caters to neither.
There are two approaches to voice recognition I'm familiar with: Mozilla Deep
Voice & CMU Sphinx. What's described here caters to both approaches, whilst the
circuit described above caters to neither.

No one understands how any specific neural network (like Mozilla Deep Voice) works, but I can expand upon how CMU Sphinx works:
No one understands how any specific neural network (like Mozilla Deep Voice)
works, but I can expand upon how CMU Sphinx works:

1. Compute "feature vectors" to describe each sliver of audio.
2. Use "Hidden Markov Models" (HMMs) to convert those feature vectors to "phones".


@@ 73,10 103,18 @@ No one understands how any specific neural network (like Mozilla Deep Voice) wor

### Circuitry

Hidden Markov Models, finite-state automatons, & ngram models can all be viewed as variants of a probability graph. To traverse these we need extensive multiplication, addition, and random-access memory lookups.
Hidden Markov Models, finite-state automatons, & ngram models can all be viewed
as variants of a probability graph. To traverse these we need extensive
multiplication, addition, and random-access memory lookups.

Additions and multiplications can be combined into matrix multiplication operations, which are also heavily used in neural networks and optimized for by GPU & KPU hardware. Maybe this could be reused to perform the audio analysis?
Additions and multiplications can be combined into matrix multiplication
operations, which are also heavily used in neural networks and optimized for by
GPU & KPU hardware. Maybe this could be reused to perform the audio analysis?

Random-access memory lookups meanwhile require some sort of RAM which is hard to optimize. But because having too large (or too small) of a language model gives a bad UX, it would be appropriate to heavily limit the available memory, and make heavier use of matrix multiplies than CMU Sphinx might.
Random-access memory lookups meanwhile require some sort of RAM which is hard to
optimize. But because having too large (or too small) of a language model gives
a bad UX, it would be appropriate to heavily limit the available memory, and make
heavier use of matrix multiplies than CMU Sphinx might.

This happens to closely describe the [MAIX SoC](https://www.seeedstudio.com/sipeed) which may power some real Rhapsode hardware.
\ No newline at end of file
This happens to closely describe the [MAIX SoC](https://www.seeedstudio.com/sipeed)
which may power some real Rhapsode hardware.

M docs/Why?.md => docs/Why?.md +7 -3
@@ 1,6 1,10 @@
I wish to show that The Web can be more private, secure, accessible, and easier to author if it limited its scope and was drastically simplified. I do not aim to support highly-interactive "webapps", but rather keep the I/O model abstract enough that it can work pretty much anywhere.
I wish to show that The Web can be more private, secure, accessible, and easier
to author if it limited its scope and was drastically simplified. I do not aim
to support highly-interactive "webapps", but rather keep the I/O model abstract
enough that it can work pretty much anywhere.

As such I'm implementing my own browser engines, and making them modular enough that you can reuse their components in other browser engines or other projects.
As such I'm implementing my own browser engines, and making them modular enough
that you can reuse their components in other browser engines or other projects.

## Bibliography
* https://invidio.us/watch?v=fPFdV-Z69Lo


@@ 10,4 14,4 @@ As such I'm implementing my own browser engines, and making them modular enough 
* https://brutalist-web.design/
* https://mastodon.social/@tbernard/103889150137765427
* https://mstdn.io/@wolf480pl/103772675972092365
* https://media.libreplanet.org/u/libreplanet/m/who-s-afraid-of-spectre-and-meltdown/
\ No newline at end of file
* https://media.libreplanet.org/u/libreplanet/m/who-s-afraid-of-spectre-and-meltdown/