Content Editable is a Scary Place

2016‑12‑12

Once in a while, people want to make part of a web page editable. They hear about the HTML contentEditable attribute, and wonder if this may be a solved problem. It is not. It is a minefield.

Whole books could probably be written about this, but here’s a little primer about how things are today, why it is a hard problem, and how there’s hope that it is going to get better.

TL;DR: If you want to use contenteditable now, don’t use it directly; use a pre-made JavaScript editor such as CKEditor, TinyMCE, and the like. If they don’t do what you want, and you need to do this yourself now, be prepared for a lot of pain, or for waiting for newer standards to stabilise, or both.

Now let’s dive in.

Contenteditable is an attempt at having a high level construct that would enable rich text editing in web pages, letting browsers do all the heavy lifting, and letting the user (via typing, keyboard shortcuts, contextual menus…) or the javascript (via invocations of execCommand) just ask for these things to happen.
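
To make the javascript side of that concrete, here is a minimal sketch of asking the browser to toggle bold on whatever is currently selected. The helper name toggleBold is made up for this example, and the document is passed in as a parameter only so the snippet can be exercised outside a browser; in a page you would pass the global document.

```javascript
// Hypothetical helper: ask the browser to toggle bold on the current selection.
// `doc` is expected to be the page's `document` (passed in for testability).
function toggleBold(doc) {
  // execCommand(commandName, showDefaultUI, value) returns a boolean;
  // false means the command was unsupported or disabled.
  return doc.execCommand("bold", false, null);
}
```

In a page this would typically be called as toggleBold(document) from a toolbar button’s click handler. Note how little the script says: everything about what “bold” means for the current selection is left to the browser.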

There are a ton of entangled reasons why this is complex, but just to get a sense of it, here is a contrived example. You can try it out in a browser, but I encourage you to think it through before trying:

<div contenteditable=true>
	<table id=t1>
		<tr><td>lorem <td>ipsum <td>dolor
		<tr><td>sit <td>amet <td>consectetur
		<tr><td>adipiscing <td>elit <td>Quisque
	</table>
	<table id=t2>
		<tr><td>elem <td>constur <td>sem <td>et <td>supit
		<tr><td>poror <td>faubus <td>tindunt <td>Pheus <td>aliam
		<tr><td>ecitur <td>pesque <td>Maenas <td>ex <td>liga
		<tr><td>soidin <td>codo <td>Mis <td>sotun <td>dissim
	</table>
</div>
<ol> <li>1 <li>2 <li>3 <li>4 <li>5 <li>6 <li>7 </ol>

table { border-collapse: collapse; }

#t1 td{
	background: red;
	font-family: serif
}

#t2 { border: dashed 5px gray; }
#t2 td {
	font-family: sans-serif;
	border: solid 1px black;
	font-weight: bold;
}
#t2 tr { background: #bbffbb; }
#t2 tr:nth-of-type(2n) { background: #ccccff; }
#t2 td:first-child { user-select: none; }

body > ol {
	font-family: monospace;
	border: dotted orange;
}

Got that? Now the user creates a selection that goes from the last cell of the last row of the first table to the second cell in the first row of the second table. Then they press “a” on the keyboard. Generally, selecting something and then typing means replacing the selection with what was typed, but in this case, what does that mean?

- Should the browser merge the two tables?
- If not, which of the two tables does the “a” go into, and which of the table cells? What happens to the other cells? Are they deleted, or do they still exist but their content is deleted? Or do they get merged using colspan?
- If you do merge the two tables, how do you do that? Naively remove the markup that corresponds to the selection? Make it into a 5 column table? 8 columns? What happens to the alternating background color? Does anything depend on whether the tables were laid out below each other (display: table) or beside each other (display: inline-table)? What happens to the borders?
- Does it make a difference that the first cell of the second table is styled with user-select: none? What if it were contenteditable=false instead?
- What font and font-weight shall the inserted “a” use?
- What if instead of typing “a” you try and paste from the clipboard after copying the 3rd to 7th items of the list? Does it affect whether the tables get merged? Do you preserve the numbering? What background do you get? Do you get a border? How about the font? Does the same thing happen if you copy it from one browser (e.g. Firefox) and paste into another one (e.g. Chrome)?
- Would it make a difference if the styles were inline instead of cascaded?
- …

There are a million subtleties like this, many of which don’t have an obvious correct answer, because it depends on what you’re trying to do.

The end result is that browsers are full of bugs and inconsistent with each other, and that the specs (ContentEditableTrue and execCommand) don’t cover all the cases and aren’t followed particularly closely by the browsers anyway. Even if that were solved and everybody harmonised on one behaviour (which isn’t happening, as browsers have mostly given up), it still wouldn’t be good enough: that harmonised behaviour may not be the one you wanted, and you would then need a separate method, or some way to opt into an alternative behaviour.

So web-based editors (CKEditor, TinyMCE, Google Docs…) go to great lengths to work around contenteditable instead of relying on it. For example, they do live DOM diffing to try to figure out what contenteditable did to the document and why, undo it, and do it again in a different way.
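
To give a feel for the diffing idea, here is a deliberately simplified sketch. It compares two serialised snapshots of the editable markup (for instance innerHTML taken before and after an edit), reports the span that changed, and can put the old text back. This is an assumption-laden toy: real editors work on the DOM tree rather than on strings, and the function names diffMarkup and undoChange are invented for this example.

```javascript
// Find the single changed span between two markup snapshots by trimming
// the common prefix and the common suffix.
function diffMarkup(before, after) {
  let start = 0;
  const max = Math.min(before.length, after.length);
  while (start < max && before[start] === after[start]) start++;

  let endBefore = before.length;
  let endAfter = after.length;
  while (
    endBefore > start &&
    endAfter > start &&
    before[endBefore - 1] === after[endAfter - 1]
  ) {
    endBefore--;
    endAfter--;
  }
  return {
    index: start,                             // where the change begins
    removed: before.slice(start, endBefore),  // text the browser took out
    inserted: after.slice(start, endAfter),   // text the browser put in
  };
}

// Revert the change so the editor can redo it in its own way.
function undoChange(after, change) {
  return (
    after.slice(0, change.index) +
    change.removed +
    after.slice(change.index + change.inserted.length)
  );
}
```

An editor following this pattern would look at removed and inserted to guess what the user meant, call undoChange, and then apply its own version of the edit.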

So we come to plan B.

What people are working on now (with Johannes as a spec editor) is a completely different approach, where the browser does not do the heavy lifting and instead just provides events to inform a JavaScript-based editor about what the user is trying to do, and APIs to facilitate doing that.

Step 1 in that story (which is reasonably far along) is to make sure that everything that would cause a change in a contenteditable element fires a JavaScript event before that change occurs, which:

- informs the javascript about the user intent
- allows the javascript to cancel the behaviour the browser was about to provide, ensuring that nothing is changed in the contenteditable
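
The two points above can be sketched as follows, assuming the event in question is the beforeinput event of the Input Events draft, whose inputType property names the intent (for example "insertText" or "deleteContentBackward"). The function takeOverEditing and its applyEdit callback are hypothetical names for this example, and the element is passed in so the snippet can be run without a browser.

```javascript
// Sketch: learn the user's intent, cancel the browser's own edit,
// and hand the intent to the editor's code instead.
// `editable` is the contenteditable element (anything with addEventListener);
// `applyEdit` is a hypothetical callback supplied by the editor.
function takeOverEditing(editable, applyEdit) {
  editable.addEventListener("beforeinput", (event) => {
    // What the user is trying to do, and with what text (if any).
    const intent = { type: event.inputType, data: event.data ?? null };
    // Some events may not be cancelable, so check before cancelling.
    if (event.cancelable) event.preventDefault();
    applyEdit(intent);
  });
}
```

In a page this might be wired up as takeOverEditing(document.querySelector("[contenteditable]"), intent => console.log(intent)), after which typing into the element should log intents rather than change the document.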

Step 2 in that story is to provide multiple modes of contenteditable. contenteditable=true is the one we know today, kept for legacy reasons. Other values (anything other than true or false) provide modes where all the events described in step 1 still fire and the insertion caret is still drawn but, depending on the mode, some of the events have no default action provided by the browser: unless js reacts to them, nothing happens at all.

contentEditable=false:
- the element is not editable
contentEditable=events:
- the caret is drawn
- the events fire
- nothing happens unless js reacts to the events
contentEditable=caret:
- the caret is drawn
- the caret can be moved by the user
- the events fire
- nothing else happens unless js reacts to the events
contentEditable=typing:
- the caret is drawn
- the caret can be moved by the user
- the events fire
- the user typing something when nothing is selected will insert text
- the user attempting to move the caret will move the caret
- IME-based composition of text works
- nothing else happens unless js reacts to the events:
  - deletion (including the cut part of cut and paste) does nothing but fire an event
  - the paste part of cut/copy and paste does nothing but fire an event
  - replacement (select something then type) does nothing but fire an event
  - formatting commands (Ctrl+B to make something bold) do nothing but fire an event
  - …
contentEditable=true:
- the caret is drawn
- the caret can be moved by the user
- the events fire
- the browser has—and unless cancelled, applies—a default behaviour for all events
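
One way to read the list above is as a lookup table from mode to the actions the browser still performs by default. The sketch below encodes it that way. The action labels ("moveCaret", "insertText" and so on) and the function names are invented for this example and are not identifiers from any spec; "insertText" here means typing with nothing selected, and "replace" means typing over a selection.

```javascript
// Default actions the browser keeps in each proposed mode, per the list above.
const DEFAULT_ACTIONS = new Map([
  ["false", []],
  ["events", []],
  ["caret", ["moveCaret"]],
  ["typing", ["moveCaret", "insertText", "composition"]],
  ["true", ["moveCaret", "insertText", "composition",
            "delete", "paste", "replace", "format"]],
]);

// Events fire in every mode except contentEditable=false.
function firesEvents(mode) {
  if (!DEFAULT_ACTIONS.has(mode)) throw new Error(`unknown mode: ${mode}`);
  return mode !== "false";
}

// Does the browser act on its own for this action in this mode?
function browserHandles(mode, action) {
  if (!DEFAULT_ACTIONS.has(mode)) throw new Error(`unknown mode: ${mode}`);
  return DEFAULT_ACTIONS.get(mode).includes(action);
}

// Does the editor's script have to implement this action itself?
function scriptMustHandle(mode, action) {
  return firesEvents(mode) && !browserHandles(mode, action);
}
```

For instance, scriptMustHandle("typing", "paste") is true while scriptMustHandle("typing", "insertText") is false, which matches the idea that the typing mode keeps plain typing and leaves the hard cases to the editor.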