A number of Unix text-processing utilities let you search for, and in some cases change, text patterns rather than fixed strings. These utilities include the editing programs ed, ex, vi, and sed, the awk programming language, and the commands grep and egrep. Text patterns (formally called regular expressions) contain normal characters mixed with special characters (called metacharacters).
—Arnold Robbins, sed & awk Pocket Reference
The ed text editor (also authored by Thompson) had regular expression support but could not be used on such a large amount of text, so Thompson excerpted that code into a standalone tool.
The original motivation was an analogue of grep (g/re/p) for substitution, hence "g/re/s". Foreseeing that further special-purpose programs for each command would also arise, such as g/re/d, McMahon wrote a general-purpose line-oriented stream editor, which became sed. The syntax for sed, notably the use of / for pattern matching, and s/// for substitution, originated with ed, the precursor to sed, which was in common use at the time, and the regular expression syntax has influenced other languages, notably ECMAScript and Perl.
The core ex commands which relate to search and replace are essential to vi. For instance, the ex command :%s/XXX/YYY/g replaces every instance of XXX with YYY, and works in vi too. The % means every line in the file. The 'g' stands for global and means replace every instance on every line (if it was not specified, then only the first instance on each line would be replaced).
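The difference the g flag makes can be sketched outside the editor. This is a minimal illustration using Python's re module rather than ex itself (Python's regular expression syntax differs from vi's in places); the function name substitute and its global_flag parameter are made up for this example.

```python
import re

def substitute(lines, pattern, replacement, global_flag=True):
    """Apply an s/pattern/replacement/ style edit to every line (the '%' range)."""
    # In re.sub, count=0 means "replace every match"; count=1 stops after the first.
    count = 0 if global_flag else 1
    return [re.sub(pattern, replacement, line, count=count) for line in lines]

lines = ["XXX and XXX", "no match here", "one XXX"]
print(substitute(lines, "XXX", "YYY"))                     # like :%s/XXX/YYY/g
print(substitute(lines, "XXX", "YYY", global_flag=False))  # like :%s/XXX/YYY/
```

Without the flag, only the first XXX on each line changes, which matches the behavior described above.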
1989, Brian Fox
Bash Reference Manual
maintainer's Bash page incl FAQ
BASH programming introduction
Type bash --version at the Bash command line to determine
which version you are running.
name | how to quit |
---|---|
ed | q <enter> |
emacs | ^x^c |
nano | ^x |
vi | :q <enter> |
8-bit codes, mapped to 256 different characters
American Standard Code for Information Interchange
usually the result of a need to increase the number of characters which can be encoded without breaking backward compatibility with an existing constraint. For example, with one byte (8 bits) per character, one can encode 256 possible characters; in order to encode more than 256 characters, the obvious choice would be to use two or more bytes per encoding unit, ... but such a change would break compatibility with existing systems and therefore might not be feasible at all.
Since the aim of a multibyte encoding system is to minimize changes to existing application software, some characters must retain their pre-existing single-unit codes, even while other characters have multiple units in their codes. The result is that there are three sorts of units in a variable-width encoding: singletons, which consist of a single unit, lead units, which come first in a multiunit sequence, and trail units, which come afterwards in a multiunit sequence. Input and display software obviously needs to know about the structure of the multibyte encoding scheme but other software generally doesn't need to know if a pair of bytes represent two separate characters or just one character.
...
UTF-8 makes it easy for a program to identify the three sorts of units, since they fall into separate value ranges.
...
The Unicode standard has two variable-width encodings: UTF-8 and UTF-16 (it also has a fixed-width encoding, UTF-32).
... in [UTF-8] singletons have the range 00-7F, lead units have the range C0-FD (now actually C2-F4...), and trail units have the range 80-BF. The lead unit also tells how many trail units follow: one after C2-DF, two after E0-EF and three after F0-F4.
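The value ranges quoted above are enough to classify any single byte. The following is a small sketch, not a full UTF-8 validator; classify_unit is a hypothetical helper name, and it only checks the ranges listed above (it does not reject overlong or surrogate sequences).

```python
def classify_unit(byte):
    """Classify one UTF-8 byte; returns (kind, number of trail units that follow)."""
    if 0x00 <= byte <= 0x7F:
        return ("singleton", 0)
    if 0x80 <= byte <= 0xBF:
        return ("trail", 0)
    if 0xC2 <= byte <= 0xDF:
        return ("lead", 1)
    if 0xE0 <= byte <= 0xEF:
        return ("lead", 2)
    if 0xF0 <= byte <= 0xF4:
        return ("lead", 3)
    # C0, C1 and F5-FF never appear in valid UTF-8
    return ("invalid", 0)

for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    encoded = ch.encode("utf-8")
    print(encoded.hex(" "), [classify_unit(b)[0] for b in encoded])
```

Because the three ranges do not overlap, a program can find character boundaries from any position in a byte stream by skipping trail units.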
UTF-8, most frequently used specification
The frustrating dilemma that researchers in this field encountered in the 1980s as they tried to develop universally interchangeable character encodings was that on the one hand, it seemed to be necessary to add more bits to accommodate additional characters. On the other hand, for the users of the relatively small character set of the Latin alphabet (who still constituted the majority of computer users at the time), those additional bits were a colossal waste of then-scarce and expensive computing resources (as they would always be zeroed out for such users).
The compromise solution that was eventually hit upon with Unicode, as further explained below, was to break the longstanding assumption (dating back to the old telegraph codes) that each character should always directly correspond to a particular pattern of encoded bits. Instead, characters would be first mapped to an intermediate stage in the form of abstract numbers known as code points. Those code points would then be encoded in a variety of ways and with various default numbers of bits per character (code units) depending upon context. To encode code points higher than the length of the code unit, such as above 256 for 8-bit units, the solution was to implement variable-width encodings where an escape sequence would signal that subsequent bits should be parsed as a higher code point.
In computing, a newline, also known as a line ending, end of line (EOL), or line break, is a special character or sequence of characters signifying the end of a line of text. The actual codes representing a newline vary across operating systems...
Two ways to view newlines... are that newlines terminate lines or that they separate lines.
OS | ASCII dec# | ASCII hex# | Symbol | Escape in most programming languages |
---|---|---|---|---|
Unix & Unix-like | 10 | 0a | LF | \n |
Windows | 13, 10 | 0d, 0a | CR/LF | \r\n |
Mac Classic (pre-OS X) | 13 | 0d | CR | \r |
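As a rough illustration of the table, text that mixes the three conventions can be normalized to the Unix form. This is a sketch only; normalize_newlines is a name chosen for the example.

```python
def normalize_newlines(text):
    """Convert Windows (CR LF) and classic Mac (CR) line endings to Unix (LF)."""
    # Replace CR LF first so its CR is not converted a second time on its own.
    return text.replace("\r\n", "\n").replace("\r", "\n")

sample = "unix\nwindows\r\nclassic mac\rend"
print([ord(c) for c in "\r\n"])            # the decimal codes from the table
print(normalize_newlines(sample).split("\n"))
```

The order of the two replacements matters: converting lone CRs first would turn each CR LF pair into two line breaks.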
history, terminology, model (1985)
HTTP (1991)
If you're still using FTP, however, please consider switching to HTTP. FTP is a protocol designed for a different era -- these days everyone should be avoiding it for multiple reasons.
header | |
nav | |
section | aside |
article | |
footer |
POSH ("Plain Old Semantic HTML")
WHATWG Living Standard (first published in 2012)
HTML comment: <!-- -->
(single-line and multi-line)
(whitespace required after opening and before closing tag)
Empty elements can be "closed" in the opening tag like this:
<br />
HTML5 does not require empty elements to be closed. But if you want stricter validation, or you need to make your document readable by XML parsers, you should close all HTML elements.
https://html.spec.whatwg.org/multipage/text-level-semantics.html
The tbody element is used in conjunction with the thead and tfoot elements. Browsers can use these elements to enable scrolling of the table body independently of the header and footer.
The h1–h6 elements are headings. The first element of heading content in an element of sectioning content represents the heading for that section. Subsequent headings of equal or higher rank start new (implied) sections, headings of lower rank start implied subsections that are part of the previous one. In both cases, the element represents the heading of the implied section.
Now that the World Wide Web Consortium seems to have taken down their website at w3.org, where do we look for a standard? The WHATWG (Web Hypertext Application Technology Working Group) has this page: spec.whatwg.org What about a CSS standard?
The DOM is a Mess (2009, jQuery creator John Resig)
DOM Level 3, the current release of the DOM specification, published in April 2004
DOM Level 4 is currently being developed.
Last Call Working Draft was released in February 2014.
The in-memory representation is known as DOM HTML, or the DOM for short.
There are various concrete syntaxes that can be used to transmit resources that use this abstract language, two of which are defined in this specification.
The first such concrete syntax is the HTML syntax. This is the format suggested for most authors. It is compatible with most legacy Web browsers. If a document is transmitted with the text/html MIME type, then it will be processed as an HTML document by Web browsers. This specification defines the latest HTML syntax, known simply as HTML.
The second concrete syntax is the XHTML syntax, which is an application of XML. When a document is transmitted with an XML MIME type, such as application/xhtml+xml, then it is treated as an XML document by Web browsers, to be parsed by an XML processor.
Let's say you open a web browser... and load a web page in it... Now, inside the browser, there is a window object. This object represents the browser window.
This window object has dozens of properties (members), the most important of them being the document object. The document object represents the web page that is currently loaded into the browser window.
A node is an object that is "connected" to other objects from the DOM tree.
The JavaScript programs that are bound to a web page have complete access to every node of the DOM tree. They can delete nodes, add new nodes, or just manipulate the properties of a node.
...
HTML describes the structure of a document. The browser parses HTML and constructs an internal representation of the elements of the document from it.
This internal representation is the DOM, the Document Object Model. This is the basis for creating the actual visual representation of the website.
The DOM for a webpage differs from the page source HTML for various reasons, e.g.:
- A <table> element requires (in the DOM) a <tbody> child element. If the page source HTML doesn't have one, the browser will put one into the DOM for the page.
The browser will just insert that <tbody> for you. It will be there in the DOM, so you'll be able to find it with JavaScript and style it with CSS, even though it's not in your HTML.
Legacy DOM was limited in the kinds of elements that could be accessed. Form, link and image elements could be referenced with a hierarchical name that began with the root document object. A hierarchical name could make use of either the names or the sequential index of the traversed elements. For example, a form input element could be accessed as either document.formName.inputName or document.forms[0].elements[0].
The Legacy DOM enabled client-side form validation and the popular "rollover" effect.
CSS 2.1
CSS level 2 revision 1...went back and forth between Working Draft status and Candidate Recommendation status for many years... Candidate Recommendation...2004... it was finally published as a W3C Recommendation on 7 June 2011.
CSS 3
Unlike CSS 2, which is a large single specification defining various features, CSS 3 is divided into several separate documents called "modules"... The earliest CSS 3 drafts were published in June 1999.
Due to the modularization, different modules have different stability and statuses. As of June 2012, there are over fifty CSS modules published from the CSS Working Group, and four of these have been published as formal Recommendations:
- 2012-06-19: Media Queries
- 2011-09-29: Namespaces
- 2011-09-29: Selectors Level 3
- 2011-06-07: Color
Some modules (including Backgrounds and Borders and Multi-column Layout among others) have Candidate Recommendation (CR) status and are considered moderately stable.
CSS 4
Because CSS3 split the CSS language's definition into modules, the modules have been allowed to level independently. Most modules are level 3... A few level-4 modules exist (such as Image Values, Backgrounds & Borders, or Selectors), which build on the functionality of a preceding level-3 module. Other modules defining entirely new functionality, such as Flexbox, have been designated as "level 1".
So, although no monolithic CSS4 will be worked on after CSS3 is finished completely, the level 4 modules can collectively be referred to as CSS4.
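* universal selector
[ ] attribute selector
. class selector
# id selector
> child combinator
+ adjacent sibling combinator
~ general sibling combinator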
div * p selects any p element that is a descendant of a descendant of a div element (that is, a grandchild or deeper).
CSS will typically style a tag using the most specific DOM description that can be applied to it.
selector(s) | {declaration(s)} |
---|---|
h1, h2 | {color: maroon; text-align: center;} |
comment: /* */
(single-line and multi-line)
(table, th, td elements)
to center a block element having width set to < 100%: margin: auto
relationships between 'display', 'position', and 'float'
Tip: When aligning elements with position or with float, always define margin and padding for the body element. This is to avoid visual differences in different browsers.
width's big caveat: the box model adds padding and border-width
solution:
use the experimental new box-sizing: border-box;
Since this is pretty new, you should use the -webkit- and -moz- prefixes for now
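The caveat is easiest to see as arithmetic. A minimal sketch, assuming equal padding and border on the left and right and ignoring margins; rendered_width and its parameters are hypothetical names, not part of any CSS API.

```python
def rendered_width(width, padding, border, box_sizing="content-box"):
    """Horizontal space (in px) a box occupies, not counting margins."""
    extras = 2 * padding + 2 * border  # left + right padding and border
    if box_sizing == "border-box":
        # The declared width already includes padding and border;
        # the box only grows if they do not fit inside it.
        return max(width, extras)
    # content-box (the default): padding and border are added on top of width.
    return width + extras

print(rendered_width(500, 50, 10))                # content-box
print(rendered_width(500, 50, 10, "border-box"))  # border-box
```

With the default model, two boxes declared with the same width but different padding end up with different rendered sizes; with border-box they line up.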
Inline elements: respect left & right margins and padding, but not top & bottom; cannot have a width and height set; allow other elements to sit to their left and right.
Inline-block elements: allow other elements to sit to their left and right; respect top & bottom margins and padding; respect height and width
Block elements: respect all of those; force a line break after the block element
The CSS3 Flexible Box, or flexbox, is a layout mode providing for the arrangement of elements on a page such that the elements behave predictably when the page layout must accommodate different screen sizes and different display devices. For many applications, the flexible box model provides an improvement over the block model in that it does not use floats, nor do the flex container's margins collapse with the margins of its contents.
The defining aspect of the flex layout is the ability to alter its items' width and/or height to best fill the available space on any display device. A flex container expands items to fill available free space, or shrinks them to prevent overflow.
The flexbox layout algorithm is direction-agnostic as opposed to the block layout, which is vertically-biased, or the inline layout, which is horizontally-biased.
the average web site has become way overtooled, which exacts a price when it comes to speed.
In computer science, ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably.
A tuple is a finite ordered list of elements. - wikipedia.org
In mathematics, a set is a collection of distinct objects...
... two sets are equal (one and the same) if and only if every element of each set is an element of the other.
What, then, is an attribute? - an attribute name plus its associated value within a particular tuple?
Data types (declared in CREATE TABLE) included in ANSI SQL:
- Character string: CHAR(n) / VARCHAR(n)
- Bit string: BIT(n) / BIT VARYING(n) (array of bits)
- Number: INTEGER / SMALLINT / BIGINT; FLOAT / REAL / DOUBLE PRECISION; NUMERIC(precision, scale) or DECIMAL(precision, scale), e.g. 123.45 has precision 5, scale 2
- Date and time

Language elements
• Clauses
• Expressions can produce either scalar values, or tables
• Predicates WHERE ___
• Queries
• Statements terminated by ;
• Insignificant whitespace

*********************** DDL ***********************
CREATE/DROP DATABASE
CREATE/DROP TABLE | VIEW
NOT NULL
UNIQUE
PRIMARY KEY
FOREIGN KEY
RENAME TABLE ?
ALTER TABLE

************** Transaction controls **************
Transactions, if available, wrap DML operations:
START TRANSACTION
SAVE TRANSACTION
COMMIT
ROLLBACK

*********************** DML ***********************
INSERT INTO [table] ([columns]) VALUES (' ', ' ', ' ') / VALUES ( , , )
UPDATE .. SET ..=' ' WHERE ..=' ' / SET ..=.. WHERE ..
MERGE ?
DELETE FROM .. WHERE
TRUNCATE TABLE
SELECT .. FROM
SELECT DISTINCT .. FROM (otherwise it can return a non-relation)

SELECT is the most complex statement in SQL, with optional keywords and clauses that include:
examples:
SELECT * FROM Persons WHERE FirstName LIKE 'a%' AND/OR ..
SELECT * FROM Persons WHERE FirstName='Peter' AND LastName='Jackson'
SELECT * FROM Persons WHERE FirstName BETWEEN 'Hansen' AND 'Petterson' ORDER BY Salary DESC
SELECT .. .. GROUP BY .. HAVING
SELECT .. .. [NATURAL/INNER/OUTER]? JOIN .. ON
SELECT COUNT(*) FROM Persons
operators: BETWEEN..AND, LIKE..%
functions: AVG, COUNT, SUM, ..
TRUNC/ROUND rounds numerics or dates (in Postgres and various DBMSs)

*********************** DCL ***********************
GRANT
REVOKE

*********************** ext ***********************
Procedural extensions
SQL is designed for a specific purpose: to query data contained in a relational database.
SQL is a set-based, declarative programming language, not an imperative programming language like C or BASIC. However, extensions to Standard SQL add procedural programming language functionality, such as control-of-flow constructs. These include:

Source | Common name | Full name |
---|---|---|
ANSI/ISO Standard | SQL/PSM | SQL/Persistent Stored Modules |
Oracle | PL/SQL | Procedural Language/SQL (based on Ada) |
PostgreSQL | PL/PSM | Procedural Language/Persistent Stored Modules |

In addition to the standard SQL/PSM extensions and proprietary SQL extensions, procedural and object-oriented programmability is available on many SQL platforms via DBMS integration with other languages. ... PostgreSQL lets users write functions in a wide variety of languages, including Perl, Python, Tcl, and C.
SQL deviates in several ways from its theoretical foundation, the relational model and its tuple calculus. In that model, a relation is a set of tuples, while in SQL, tables and query results are lists of rows: the same row may occur multiple times, and the order of rows can be employed in queries (e.g. in the LIMIT clause).
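The gap between a relation (a set of tuples) and an SQL result (a list of rows) shows up as soon as a table holds duplicates. A small sketch using Python's built-in sqlite3 module; the Persons table and its contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (FirstName TEXT)")
conn.executemany(
    "INSERT INTO Persons VALUES (?)",
    [("Peter",), ("Peter",), ("Anna",)],
)

# A plain SELECT returns every row, duplicates included.
all_rows = conn.execute(
    "SELECT FirstName FROM Persons ORDER BY FirstName"
).fetchall()

# DISTINCT removes duplicates, giving a result that behaves like a set.
distinct_rows = conn.execute(
    "SELECT DISTINCT FirstName FROM Persons ORDER BY FirstName"
).fetchall()

print(all_rows)
print(distinct_rows)
```

The first result contains the row for Peter twice, which a true relation could not.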
Modular arithmetic used to be something that every programmer encountered because it is part of the hardware of every machine. You find it in the way numbers are represented in binary and in machine code or assembly language instructions.
Once you get away from the representation of numbers as bit strings and arithmetic via registers then many mod and remainder operations lose their immediate meaning so familiar to assembly language programmers.
As soon as you start implementing even the simplest of algorithms the need to understand mod will occur.
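A short sketch of what "part of the hardware" means here: fixed-width registers do arithmetic modulo a power of two. The 8-bit width and the helper names add_u8 and to_signed8 are assumptions chosen for the illustration.

```python
def add_u8(a, b):
    """Add two values the way an 8-bit unsigned register would: modulo 2**8."""
    return (a + b) % 256

def to_signed8(value):
    """Reinterpret an 8-bit pattern as a two's-complement signed number."""
    value %= 256
    return value - 256 if value >= 128 else value

print(add_u8(250, 10))   # the sum 260 wraps around past 255
print(to_signed8(0xFF))  # the all-ones byte read as a signed value
print(-7 % 3)            # Python's % takes the sign of the divisor
```

Note that the result of % on negative operands differs between languages (C truncates toward zero, Python floors), which is one reason mod and remainder are worth telling apart.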
Learn regular expressions in about 55 minutes
man grep good coverage of RE
.
string concatenation operator
x
string repetition operator
various ways to concatenate:
$filename = "/tmp/${name}.tmp";
$filename = '/tmp/' . $name . '.tmp';
$filename = join('', '/tmp/', $name, '.tmp');
#
single-line comment
Unlike many programming languages Perl does not currently implement true multiline comments. This, and the workarounds that are in common use can be problematic. This could be solved by adding a new syntax...
The alternatives available to Perl programmers however are neither easy (comment each and every line of code individually...), nor consistent (use some other language feature to simulate multiline comments, like POD [Plain Old Documentation]).
What's Wrong With POD?
- it's not intuitive
- it's not documentation
... most of the time comments are not intended to be included in the documentation (separate from the code). This is particularly true when using comments as a debugging tool.
- it doesn't encourage consistency
learn.perl.org
perlmonks.org
stackoverflow.com
Perl, by default, does not have any specific tokens to specify multiline comments. Nevertheless, there are several workarounds to accomplish this.
The easiest solution is to use the POD [Plain Old Documentation] system, enclosing the comments between 2 lines: the first line should begin with '=comment' and the last line should begin with '=cut'.
...
Perl has rather inflexible and limited comments. The desire to preserve compatibility with shell languages dictated the use of #, but here Perl robbed itself of [an] important symbol (that can be used, for example for casting scalars into numeric).
.
string concatenation operator
//
or #
single-line comment
/*
to */
multi-line comment
"use strict";
window properties: global variables, global objects
innerWidth, innerHeight
document, location, history, screen, navigator*
- document.getElementById("abc")
  document.getElementsByTagName("h2")
  document.getElementsByClassName("class1")
  document.getElementsByName("name1")
  document.querySelectorAll() ?
- location.href/hostname/pathname/protocol/assign(load doc)
- history.back()/forward()
- screen.width/height/availWidth/availHeight/colorDepth=pixelDepth
- navigator.appName/appCodeName/platform/cookieEnabled/product (engine)/appVersion/userAgent(v detail)/language/javaEnabled()
* different browsers can use the same name
* navigator data can be changed by browser owner
* some browsers misidentify themselves to bypass site tests
* browsers cannot report new operating systems (released later)

window methods: global functions
open(), close(), moveTo(), resizeTo()
alert(), confirm(), prompt()
setInterval(), clearInterval(), setTimeout(), clearTimeout()
(window properties and methods can be written without the window. prefix)

document.cookie
JS can create, read, and delete cookies with this property.
document.cookie = "username=John Doe; expires=Thu, 18 Dec 2013 12:00:00 UTC; path=/";
document.cookie = "username=; expires=Thu, 01 Jan 1970 00:00:00 UTC; path=/";
document.cookie will return all cookies in one string much like: cookie1=value; cookie2=value; cookie3=value;

There are no official standards for the Browser Object Model (BOM). Since modern browsers have implemented (almost) the same methods and properties for JavaScript interactivity, it is often referred to, as methods and properties of the BOM. [- w3schools.com]

https://www.youtube.com/watch?v=dgI52y27O_I
49:00- top-down CSS selector
51:00- bottom-up - how CSS selectors are implemented in browsers (CSS selector engines) -- sometimes this style is faster
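Since document.cookie hands back every cookie in one string, reading a single value means splitting that string. The sketch below shows the idea in Python rather than in the browser; parse_cookie_string is a hypothetical helper, and it assumes simple name=value pairs with no quoting or encoding.

```python
def parse_cookie_string(cookie_string):
    """Split a 'name=value; name=value' string into a dict."""
    cookies = {}
    for pair in cookie_string.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # skip the empty piece after a trailing ';'
        name, _, value = pair.partition("=")
        cookies[name] = value
    return cookies

print(parse_cookie_string("cookie1=value; cookie2=value; cookie3=value;"))
```

The same split-on-semicolon, split-on-first-equals approach carries over directly to JavaScript string methods.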
JS types are "dynamic"--the same variable can be used as different types
<value> | typeof <value> |
---|---|
true | boolean |
1.11 | number |
Math.PI | number |
Number.MAX_VALUE | number |
"1.1" | string |
Date() | string |
new Date() | object |
{name:"Joe", gender:"m"} | object |
[1,2,"abc"] | object |
eval | function |
undefined | undefined |
null | object ?! |
; at the end of each simple statement
; at the end of object, date object, array definitions
//
/* */
myFunction() and window.myFunction() are the same function.
this is the object that "owns" the JavaScript code.
object-based
runtime evaluation (eval function)
objects are prototype-based (classless; instance-based)
supports functional programming (declarative; referential transparency)
learn about:
abstract, arguments, byte, char, class*, const, delete, eval, export*, extends*, final, implements, import*, instanceof, int, interface, let, native, package, private, protected, public, static, super*, synchronized, this, transient, void, volatile, with, yield
+
string concatenation and number addition operator
ECMAScript®2015 Language Specification
a program that doesn't need pre-processing (e.g. compiling) before being run. - w3.org
JavaScript and Java are completely different languages, both in concept and design. JavaScript was invented by Brendan Eich in 1995, and became an ECMA standard in 1997. ECMA-262 is the official name. ECMAScript 5 (JavaScript 1.8.5 - July 2010) is the current standard.
The World Wide Web Consortium (W3C), founded in 1994 to promote open standards for the World Wide Web, brought Netscape Communications and Microsoft together with other companies to develop a standard for browser scripting languages, called ECMAScript. The first version of the standard was published in 1997.
print("Content-type: text/html\n\n");