Cafehack Group: May 2009

Thursday, May 28, 2009

Scripting the Vim editor, Part 1: Variables, values, and expressions

Start with the basic elements of Vimscript

Damian Conway, Dr. (damian@conway.org), CEO and Chief Trainer, Thoughtstream

Summary: Vimscript is a mechanism for reshaping and extending the Vim editor. Scripting allows you to create new tools, simplify common tasks, and even redesign and replace existing editor features. This article (the first in a series) introduces the fundamental components of the Vimscript programming language: values, variables, expressions, statements, functions, and commands. These features are demonstrated and explained through a series of simple examples.

A great text editor

There's an old joke that Emacs would be a great operating system if only it had a decent text editor, whereas vi would be a great text editor if only it had a decent operating system. This gag reflects the single greatest strategic advantage that Emacs has always had over vi: an embedded extension programming language. Indeed, the fact that Emacs users are happy to put up with RSI-inducing control chords and are willing to write their extensions in Lisp shows just how great an advantage a built-in extension language must be.

But vi programmers no longer need cast envious glances towards Emacs' parenthetical scripting language. Our favorite editor can be scripted too—and much more humanely than Emacs.

In this series of articles, we'll look at the most popular modern variant of vi, the Vim editor, and at the simple yet extremely powerful scripting language that Vim provides. This first article explores the basic building blocks of Vim scripting: variables, values, expressions, simple flow control, and a few of Vim's numerous utility functions.

I'll assume that you already have access to Vim and are familiar with its interactive features. If that's not the case, some good starting points are Vim's own Web site and various online resources and hardcopy books, or you can simply type :help inside Vim itself. See the Resources section for links.

Unless otherwise indicated, all the examples in this series of articles assume you're using Vim version 7.2 or higher. You can check which version of Vim you're using by invoking the editor like so:

vim --version

or by typing :version within Vim itself. If you're using an older incarnation of Vim, upgrading to the latest release is strongly recommended, as previous versions do not support many of the features of Vimscript that we'll be exploring. The Resources section has a link to download and upgrade Vim.

Vimscript

Vim's scripting language, known as Vimscript, is a typical dynamic imperative language and offers most of the usual language features: variables, expressions, control structures, built-in functions, user-defined functions, first-class strings, high-level data structures (lists and dictionaries), terminal and file I/O, regex pattern matching, exceptions, and an integrated debugger.

You can read Vim's own documentation of Vimscript via the built-in help system, by typing:

:help vim-script-intro

inside any Vim session. Or just read on.

Running Vim scripts

There are numerous ways to execute Vim scripting commands. The simplest approach is to put them in a file (typically with a .vim extension) and then execute the file by :source-ing it from within a Vim session:

:source /full/path/to/the/scriptfile.vim

Alternatively, you can type scripting commands directly on the Vim command line, after the colon. For example:

:call MyBackupFunc(expand('%'), { 'all':1, 'save':'recent'})

But very few people do that. After all, the whole point of scripting is to reduce the amount of typing you have to do. So the most common way to invoke Vim scripts is by creating new keyboard mappings, like so:

:nmap ;s :source /full/path/to/the/scriptfile.vim<CR>
:nmap \b :call MyBackupFunc(expand('%'), { 'all': 1 })<CR>

Commands like these are usually placed in the .vimrc initialization file in your home directory. Thereafter, when you're in Normal mode (in other words, not inserting text), the key sequence ;s will execute the specified script file, and a \b sequence will call the MyBackupFunc() function (which you presumably defined somewhere in your .vimrc as well).

All of the Vimscript examples in this article use key mappings of various types as triggers. In later articles, we'll explore two other common invocation techniques: running scripts as colon commands from Vim's command line, and using editor events to trigger scripts automatically.

A syntactic example

Vim has very sophisticated syntax highlighting facilities, which you can turn on with the built-in :syntax enable command, and off again with :syntax off.

It's annoying to have to type ten or more characters every time you want to toggle syntax highlighting, though. Instead, you could place the following lines of Vimscript in your .vimrc file:

Listing 1. Toggling syntax highlighting

function! ToggleSyntax()
  if exists("g:syntax_on")
     syntax off
  else
     syntax enable
  endif
endfunction

nmap <silent>  ;s  :call ToggleSyntax()<CR>

This causes the ;s sequence to flip syntax highlighting on or off each time it's typed when you're in Normal mode. Let's look at each component of that script.

The first block of code is obviously a function declaration, defining a function named ToggleSyntax(), which takes no arguments. That user-defined function first calls a built-in Vim function named exists(), passing it a string. The exists() function determines whether a variable with the name specified by the string (in this case, the global variable g:syntax_on) has been defined.

If so, the if statement executes a syntax off; otherwise it executes a syntax enable. Because syntax enable defines the g:syntax_on variable, and syntax off undefines it, calling the ToggleSyntax() function repeatedly alternates between enabling and disabling syntax highlighting.

All that remains is to set up a key sequence (;s in this example) to call the ToggleSyntax() function:

nmap <silent> ;s :call ToggleSyntax()<CR>

nmap stands for "normal-mode key mapping." The <silent> option after the nmap causes the mapping not to echo any command it's executing, ensuring that the new ;s command will do its work unobtrusively. That work is to execute the command:

:call ToggleSyntax()

which is how you call a function in Vimscript when you intend to ignore the return value.

Note that the <CR> at the end is the literal sequence of characters <,C,R,>. Vimscript recognizes this as being equivalent to a literal carriage return. In fact, Vimscript understands many other similar representations of unprintable characters. For example, you could create a keyboard mapping to make your space bar act like the page-down key (as it does in most Web browsers), like so:

:nmap <Space> <PageDown>

You can see the complete list of these special symbols by typing :help keycodes within Vim.

Note too that ToggleSyntax() was able to call the built-in syntax command directly. That's because every built-in colon command in Vim is automatically also a statement in Vimscript. For example, to make it easier to create centered titles for documents written in Vim, you could create a function that capitalizes each word on the current line, centers the entire line, and then jumps to the next line, like so:

Listing 2. Creating centered titles

function! CapitalizeCenterAndMoveDown()
  s/\<./\u&/g   "Built-in substitution capitalizes each word
  center        "Built-in center command centers entire line
  +1            "Built-in relative motion (+1 line down)
endfunction

nmap <silent>  \C  :call CapitalizeCenterAndMoveDown()<CR>

Vimscript statements

As the previous examples illustrate, all statements in Vimscript are terminated by a newline (as in shell scripts or Python). If you need to run a statement across multiple lines, the continuation marker is a single backslash. Unusually, the backslash doesn't go at the end of the line to be continued, but rather at the start of the continuation line:

Listing 3. Continuing lines using backslash

call SetName(
\             first_name,
\             middle_initial,
\             family_name
\           )

You can also put two or more statements on a single line by separating them with a vertical bar:

echo "Starting..." | call Phase(1) | call Phase(2) | echo "Done"

That is, the vertical bar in Vimscript is equivalent to a semicolon in most other programming languages. Unfortunately, Vim couldn't use the semicolon, as that character already means something else at the start of a command (specifically, it means "from the current line to..." as part of the command's line range).

Comments

One important use of the vertical bar as a statement separator is in commenting. Vimscript comments start with a double-quote and continue to the end of the line, like so:

Listing 4. Commenting in Vimscript

if exists("g:syntax_on")
  syntax off      "Not 'syntax clear' (which does something else)
else
  syntax enable   "Not 'syntax on' (which overrides colorscheme)
endif

Unfortunately, Vimscript strings can also start with a double-quote and always take precedence over comments. This means you can't put a comment anywhere that a string might be expected, because it will always be interpreted as a string:

echo "> " "Print generic prompt

The echo command expects one or more strings, so this line produces an error complaining about the missing closing quote on (what Vim assumes to be) the second string.

Comments can, however, always appear at the very start of a statement, so you can fix the above problem by using a vertical bar to explicitly begin a new statement before starting the comment, like so:

echo "> " |"Print generic prompt

Values and variables

Variable assignment in Vimscript requires a special keyword, let:

Listing 5. Using the let keyword

let name = "Damian"

let height = 165

let interests = [ 'Cinema', 'Literature', 'World Domination', 101 ]

let phone     = { 'cell':5551017346, 'home':5558038728, 'work':'?' }

Note that strings can be specified with either double-quotes or single-quotes as delimiters. Double-quoted strings honor special "escape sequences" such as "\n" (for newline), "\t" (for tab), "\u263A" (for Unicode smiley face), or "\<ESC>" (for the escape character). In contrast, single-quoted strings treat everything inside their delimiters as literal characters, except two consecutive single-quotes, which are treated as a literal single-quote.
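The difference between the two quoting styles can be checked with a minimal sketch like the following. It assumes only Vim's built-in strlen() function; the variable names are illustrative and not part of the original listings.

```vim
" Double quotes interpret escape sequences, so \t becomes one Tab character.
let dq = "a\tb"
" Single quotes keep the backslash and the t as two literal characters.
let sq = 'a\tb'

" Prints 3 (a, Tab, b)
echo strlen(dq)
" Prints 4 (a, backslash, t, b)
echo strlen(sq)

" Inside single quotes, two consecutive single-quotes yield one literal quote.
let quoted = 'It''s'
" Prints It's
echo quoted
```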

Values in Vimscript are typically one of the following three types:

scalar: a single value, such as a string or a number. For example: "Damian" or 165
list: an ordered sequence of values delimited by square brackets, with implicit integer indices starting at zero. For example: ['Cinema', 'Literature', 'World Domination', 101]
dictionary: an unordered set of values delimited by braces, with explicit string keys. For example: {'cell':5551017346, 'home':5558038728, 'work':'?'}

Note that the values in a list or dictionary don't have to be all of the same type; you can mix strings, numbers, and even nested lists and dictionaries if you wish.
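As a rough sketch of how these three kinds of value behave, the following reads individual elements back out of a list and a dictionary. The phone numbers and the person dictionary are made-up illustrations, not values from the article.

```vim
let interests = ['Cinema', 'Literature', 'World Domination', 101]
" Hypothetical sample data for illustration only.
let phone     = {'home': '555-0100', 'work': '?'}

" List elements are accessed by zero-based integer index: prints Cinema
echo interests[0]
" Negative indices count back from the end of the list: prints 101
echo interests[-1]

" Dictionary entries are accessed by string key: prints 555-0100
echo phone['home']
" Simple keys can also use the dot form: prints ?
echo phone.work

" Values can be nested, here a dictionary that contains a list.
let person = {'name': 'Damian', 'interests': interests}
" Prints Literature
echo person.interests[1]
" Prints 4
echo len(interests)
```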

Unlike values, variables have no inherent type. Instead, they take on the type of the first value assigned to them. So, in the preceding example, the name and height variables are now scalars (that is, they can henceforth store only strings or numbers), interests is now a list variable (that is, it can store only lists), and phone is now a dictionary variable (and can store only dictionaries). Variable types, once assigned, are permanent and strictly enforced at runtime:

let interests = 'unknown' " Error: variable type mismatch

By default, a variable is scoped to the function in which it is first assigned, or is global if its first assignment occurs outside any function. However, variables may also be explicitly declared as belonging to other scopes, using a variety of prefixes, as summarized in Table 1.

Table 1. Vimscript variable scoping

Prefix	Meaning
g:varname	The variable is global
s:varname	The variable is local to the current script file
w:varname	The variable is local to the current editor window
t:varname	The variable is local to the current editor tab
b:varname	The variable is local to the current editor buffer
l:varname	The variable is local to the current function
a:varname	The variable is a parameter of the current function
v:varname	The variable is one that Vim predefines
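Several of these prefixes can be seen working together in a short sketch such as the one below. The function CountedGreeting() and its variables are hypothetical names invented for this illustration, and the code is assumed to be :source'd from a script file, since s: variables only make sense in a script context.

```vim
" Script-local counter: visible only inside this script file.
let s:call_count = 0

function! CountedGreeting(name)
  " a: reads a parameter; l: marks a function-local variable.
  let l:greeting = 'Hello, ' . a:name
  let s:call_count += 1
  " g: creates or updates a global variable.
  let g:last_greeting = l:greeting
  return l:greeting . ' (call ' . s:call_count . ')'
endfunction

" First call prints: Hello, Vim (call 1)
echo CountedGreeting('Vim')
" Prints: Hello, Vim
echo g:last_greeting
" v: variables are predefined by Vim; this one holds the version number.
echo v:version
```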

There are also pseudovariables that scripts can use to access the other types of value containers that Vim provides. These are summarized in Table 2.

Table 2. Vimscript pseudovariables

Prefix	Meaning
&varname	A Vim option (local option if defined, otherwise global)
&l:varname	A local Vim option
&g:varname	A global Vim option
@varname	A Vim register
$varname	An environment variable
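A brief sketch of how the entries in Table 2 can be read and assigned like ordinary variables follows. The variable name saved_tabstop and the choice of register a are arbitrary; note that the sketch overwrites whatever register a currently holds.

```vim
" Save the current option value, change it, then restore it.
let saved_tabstop = &tabstop
let &tabstop = 4
" Prints 4
echo &tabstop
let &tabstop = saved_tabstop

" Registers are written and read with the @ prefix.
let @a = 'stored in register a'
echo @a

" Environment variables use the $ prefix.
echo $HOME
```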

The "option" pseudovariables can be particularly useful. For example, you could set up two key-maps to increase or decrease the current tabspacing like so:

nmap <silent> ]] :let &tabstop += 1<CR>
nmap <silent> [[ :let &tabstop -= &tabstop > 1 ? 1 : 0<CR>

Expressions

Note that the [[ key-mapping in the previous example uses an expression containing a C-like "ternary expression":

&tabstop > 1 ? 1 : 0

This prevents the key map from decrementing the current tab spacing below the sane minimum of 1. As this example suggests, expressions in Vimscript are composed of the same basic operators that are used in most other modern scripting languages, and with generally the same syntax. The available operators (grouped by increasing precedence) are summarized in Table 3.

Table 3. Vimscript operator precedence table

Operation	Operator syntax
Assignment	let var = expr
Numeric-add-and-assign	let var += expr
Numeric-subtract-and-assign	let var -= expr
String-concatenate-and-assign	let var .= expr

Ternary operator	bool ? expr-if-true : expr-if-false

Logical OR	bool || bool

Logical AND	bool && bool

Numeric or string equality	expr == expr
Numeric or string inequality	expr != expr
Numeric or string greater-than	expr > expr
Numeric or string greater-or-equal	expr >= expr
Numeric or string less-than	expr < expr
Numeric or string less-or-equal	expr <= expr

Numeric addition	num + num
Numeric subtraction	num - num
String concatenation	str . str

Numeric multiplication	num * num
Numeric division	num / num
Numeric modulus	num % num

Convert to number	+ num
Numeric negation	- num
Logical NOT	! bool

Parenthetical precedence	( expr )

Logical caveats

In Vimscript, as in C, only the numeric value zero is false in a boolean context; any non-zero numeric value—whether positive or negative—is considered true. However, all the logical and comparison operators consistently return the value 1 for true.

When a string is used as a boolean, it is first converted to an integer, and then evaluated for truth (non-zero) or falsehood (zero). This implies that the vast majority of strings—including most non-empty strings—will evaluate as being false. A typical mistake is to test for an empty string like so:

Listing 6. Flawed test for empty string

let result_string = GetResult()

if !result_string
  echo "No result"
endif

The problem is that, although this does work correctly when result_string is assigned an empty string, it also indicates "No result" if result_string contains a string like "I am NOT an empty string", because that string is first converted to a number (zero) and then to a boolean (false).

The correct solution is to explicitly test strings for emptiness using the appropriate built-in function:

Listing 7. Correct test for empty string

if empty(result_string)
  echo "No result"
endif
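The string-to-number conversion behind this caveat can be observed directly. The following is a small sketch using literal strings chosen purely for illustration:

```vim
" A string used as a boolean is first converted to a number.
" 'abc' has no leading digits, so it converts to 0 (false): prints 1
echo !'abc'
" '42 apples' starts with digits, so it converts to 42 (true): prints 0
echo !'42 apples'

" empty() tests the string itself rather than its numeric value.
" Prints 0
echo empty('abc')
" Prints 1
echo empty('')
```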

Comparator caveats

In Vimscript, comparators always perform numeric comparison, unless both operands are strings. In particular, if one operand is a string and the other a number, the string will be converted to a number and the two operands then compared numerically. This can lead to subtle errors:

let ident = 'Vim'

if ident == 0   "Always true (string 'Vim' converted to number 0)

A more robust solution in such cases is:

if ident == '0'   "Uses string equality if ident contains string
                 "but numeric equality if ident contains number
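To make the two cases concrete, here is a short sketch of the comparisons involved. The variable numeric_ident is a hypothetical name added for the illustration.

```vim
let ident = 'Vim'
" Mixed string/number comparison: 'Vim' converts to 0, so this prints 1
echo ident == 0
" Both operands are strings, so they are compared as strings: prints 0
echo ident == '0'

" With a number on the left, '0' is converted to 0 and compared numerically.
let numeric_ident = 0
" Prints 1
echo numeric_ident == '0'
```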

String comparisons normally honor the local setting of Vim's ignorecase option, but any string comparator can also be explicitly marked as case-sensitive (by appending a #) or case-insensitive (by appending a ?):

Listing 8. Casing string comparators

if name ==? 'Batman'         |"Equality always case insensitive
  echo "I'm Batman"
elseif name <# 'ee cummings' |"Less-than always case sensitive
  echo "the sky was can dy lu minous"
endif

Using the "explicitly cased" operators for all string comparisons is strongly recommended, because they ensure that scripts behave reliably regardless of variations in the user's option settings.
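The effect of the ignorecase option on the plain operators, and the stability of the explicitly cased ones, can be seen in a sketch like this one. The variable saved_ignorecase is an illustrative name; it is there only so the user's setting is restored afterwards.

```vim
" Remember the user's setting so it can be restored afterwards.
let saved_ignorecase = &ignorecase

set ignorecase
" Plain == now ignores case: prints 1
echo 'Batman' == 'BATMAN'
set noignorecase
" With the option off, the same comparison prints 0
echo 'Batman' == 'BATMAN'

" The explicit forms give the same answer under either setting.
" Prints 1
echo 'Batman' ==? 'BATMAN'
" Prints 0
echo 'Batman' ==# 'BATMAN'

let &ignorecase = saved_ignorecase
```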

Arithmetic caveats

When using arithmetic expressions, it's also important to remember that, until version 7.2, Vim supported only integer arithmetic. A common mistake under earlier versions was writing something like:

Listing 9. Problem with integer arithmetic

"Step through each file...
for filenum in range(filecount)
  " Show progress...
  echo (filenum / filecount * 100) . '% done'

  " Make progress...
  call ProcessFile(filenum)
endfor

Because filenum will always be less than filecount, the integer division filenum/filecount will always produce zero, so each iteration of the loop will echo:

0% done

Even under version 7.2, Vim does floating-point arithmetic only if one of the operands is explicitly floating-point:

let filecount = 234

echo filecount/100   |" echoes 2
echo filecount/100.0 |" echoes 2.34
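One possible way to repair the progress calculation from Listing 9 is sketched below: multiply before dividing so that integer division no longer truncates to zero, or use a floating-point operand. The file count of 8 is an arbitrary value for the illustration, and the second form assumes a Vim built with floating-point support.

```vim
let filecount = 8

for filenum in range(filecount)
  " Multiply first: 3 * 100 / 8 is 37, whereas 3 / 8 * 100 is 0.
  echo (filenum * 100 / filecount) . '% done'
endfor

" Alternatively, make one operand a float and format the result with printf().
" Prints 37.5% done
echo printf('%.1f%% done', 3 * 100.0 / filecount)
```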

Another toggling example

It's easy to adapt the syntax-toggling script shown earlier to create other useful tools. For example, if there is a set of words that you frequently misspell or misapply, you could add a script to your .vimrc to activate Vim's match mechanism and highlight problematic words when you're proofreading text.

For example, you could create a key-mapping (say: ;p) that causes the troublesome words in text like the previous paragraph (It's, to, there, your, and you're) to be displayed in bold within Vim.

That script might look like this:

Listing 10. Highlighting frequently misused words

"Create a text highlighting style that always stands out...
highlight STANDOUT term=bold cterm=bold gui=bold

"List of troublesome words...
let s:words = [
            \ "it's",  "its",
            \ "your",  "you're",
            \ "were",  "we're",   "where",
            \ "their", "they're", "there",
            \ "to",    "too",     "two"
            \ ]

"Build a Vim command to match troublesome words...
let s:words_matcher
\ = 'match STANDOUT /\c\<\(' . join(s:words, '\|') . '\)\>/'

"Toggle word checking on or off...
function! WordCheck ()
  "Toggle the flag (or set it if it doesn't yet exist)...
  let w:check_words = exists('w:check_words') ? !w:check_words : 1

  "Turn match mechanism on/off, according to new state of flag...
  if w:check_words
     exec s:words_matcher
  else
     match none
  endif
endfunction

"Use ;p to toggle checking...

nmap <silent>  ;p  :call WordCheck()<CR>

The variable w:check_words is used as a boolean flag to toggle word checking on or off. The first line of the WordCheck() function checks to see if the flag already exists, in which case the assignment simply toggles the variable's boolean value:

let w:check_words = exists('w:check_words') ? !w:check_words : 1

If w:check_words does not yet exist, it is created by assigning the value 1 to it:

let w:check_words = exists('w:check_words') ? !w:check_words : 1

Note the use of the w: prefix, which means that the flag variable is always local to the current window. This allows word checking to be toggled independently for each editor window (which is consistent with the behavior of the match command, whose effects are always local to the current window as well).

Word checking is enabled by setting Vim's match command. A match expects a text-highlighting specification (STANDOUT in this example), followed by a regular expression that specifies which text to highlight. In this case, that regex is constructed by OR'ing together all of the words specified in the script's s:words list variable (that is: join(s:words, '\|')). That set of alternatives is then bracketed by case-insensitive word boundaries (\c\<\(...\)\>) to ensure that only entire words are matched, regardless of capitalization.

The WordCheck() function then interprets the resulting string as a Vim command and executes it (exec s:words_matcher) to turn on the matching facility. When w:check_words is toggled off, the function performs a match none command instead, to deactivate the special matching.
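To see what the constructed command looks like, here is a sketch that builds the same kind of string from a shortened word list and then tries the pattern with the =~ operator. The three-word list and the test sentences are illustrative only.

```vim
let words   = ["its", "it's", "your"]
let matcher = 'match STANDOUT /\c\<\(' . join(words, '\|') . '\)\>/'
" Prints: match STANDOUT /\c\<\(its\|it's\|your\)\>/
echo matcher

" The same regex can be tested on its own against sample text.
let pattern = '\c\<\(' . join(words, '\|') . '\)\>'
" Whole word, different capitalization: prints 1
echo 'Mind YOUR step' =~ pattern
" 'Yours' is not a whole-word match for 'your': prints 0
echo 'Yours truly' =~ pattern
```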

Scripting in Insert mode

Vimscripting is by no means restricted to Normal mode. You can also use the imap or iabbrev commands to set up key-mappings or abbreviations that can be used while inserting text. For example:

imap <silent> <C-D><C-D> <C-R>=strftime("%e %b %Y")<CR>
imap <silent> <C-T><C-T> <C-R>=strftime("%l:%M %p")<CR>

With these mappings in your .vimrc, typing CTRL-D twice while in Insert mode causes Vim to call its built-in strftime() function and insert the resulting date, while double-tapping CTRL-T likewise inserts the current time.

You can use the same general pattern to cause an insertion map or an abbreviation to perform any scriptable action. Just put the appropriate Vimscript expression or function call between an initial <C-R>= (which tells Vim to insert the result of evaluating what follows) and a final <CR> (which tells Vim to actually evaluate the preceding expression). Remember, though, that <C-R> (Vim's abbreviation for CTRL-R) is not the same as <CR> (Vim's abbreviation for a carriage return).

For example, you could use Vim's built-in getcwd() function to create an abbreviation for the current working directory, like so:

iabbrev <silent> CWD <C-R>=getcwd()<CR>

Or you could embed a simple calculator that can be called by typing CTRL-C during text insertions:

imap <silent> <C-C> <C-R>=string(eval(input("Calculate: ")))<CR>

Here, the expression:

string( eval( input("Calculate: ") ) )

first calls the built-in input() function to request the user to type in their calculation, which input() then returns as a string. That input string is then passed to the built-in eval(), which evaluates it as a Vimscript expression and returns the result. Next, the built-in string() function converts the numeric result back to a string, which the key-mapping's <C-R>= sequence is then able to insert.

A more complex Insert-mode script

Insertion mappings can involve scripts considerably more sophisticated than the previous examples. In such cases, it's usually a good idea to refactor the code out into a user-defined function, which the key-mapping can then call.

For example, you could change the behavior of CTRL-Y during insertions. Normally a CTRL-Y in Insert mode does a "vertical copy." That is, it copies the character in the same column from the line immediately above the cursor. For example, a CTRL-Y in the following situation would insert an "m" at the cursor:

Glib jocks quiz nymph to vex dwarf
Glib jocks quiz ny_

However, you might prefer your vertical copies to ignore any intervening empty lines and instead copy the character from the same column of the first non-blank line anywhere above the insertion point. That would mean, for instance, that a CTRL-Y in the following situation would also insert an "m", even though the immediately preceding line is empty:

Glib jocks quiz nymph to vex dwarf

Glib jocks quiz ny_

You could achieve this enhanced behavior by placing the following in your .vimrc file:

Listing 11. Improving vertical copies to ignore blank lines

"Locate and return character "above" current cursor position...
function! LookUpwards()
  "Locate current column and preceding line from which to copy...
  let column_num      = virtcol('.')
  let target_pattern  = '\%' . column_num . 'v.'
  let target_line_num = search(target_pattern . '*\S', 'bnW')

  "If target line found, return vertically copied character...

  if !target_line_num
     return ""
  else
     return matchstr(getline(target_line_num), target_pattern)
  endif
endfunction

"Reimplement CTRL-Y within insert mode...

imap <silent>  <C-Y>  <C-R><C-R>=LookUpwards()<CR>

The LookUpwards() function first determines which on-screen column (or "virtual column") the insertion point is currently in, using the built-in virtcol() function. The '.' argument specifies that you want the column number of the current cursor position:

let column_num = virtcol('.')

LookUpwards() then uses the built-in search() function to look backwards through the file from the cursor position:

let target_pattern  = '\%' . column_num . 'v.'
let target_line_num = search(target_pattern . '*\S', 'bnW')

The search uses a special target pattern (namely: \%column_numv.*\S) to locate the closest preceding line that has a non-whitespace character (\S) at or after (.*) the cursor column (\%column_numv). The second argument to search() is the configuration string bnW, which tells the function to search backwards but not to move the cursor nor to wrap the search. If the search is successful, search() returns the line number of the appropriate preceding line; if the search fails, it returns zero.

The if statement then works out which character—if any—is to be copied back down to the insertion point. If a suitable preceding line was not found, target_line_num will have been assigned zero, so the first return statement is executed and returns an empty string (indicating "insert nothing").

If, however, a suitable preceding line was identified, the second return statement is executed instead. It first gets a copy of that preceding line from the current editor buffer:

return matchstr(getline(target_line_num), target_pattern)

It then finds and returns the one-character string that the previous call to search() successfully matched:

return matchstr(getline(target_line_num), target_pattern)
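To see how the column-anchored pattern picks out a single character, here is a sketch that applies it to the sample sentence used earlier. The column number 19 is hard-coded for the illustration, standing in for the value that virtcol('.') would return with the cursor after "ny".

```vim
" Column 19 is where the cursor sits after typing 'Glib jocks quiz ny'.
let column_num     = 19
let target_pattern = '\%' . column_num . 'v.'
" Prints: \%19v.
echo target_pattern

" matchstr() returns the one character found at that virtual column.
" Prints: m
echo matchstr('Glib jocks quiz nymph to vex dwarf', target_pattern)

" A line shorter than the column has nothing to copy, so the result is empty.
echo matchstr('Too short', target_pattern)
```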

Having implemented this new vertical copy behavior inside LookUpwards(), all that remains is to override the standard CTRL-Y command in Insert mode, using an imap:

imap <silent> <C-Y> <C-R><C-R>=LookUpwards()<CR>

Note that, whereas earlier imap examples all used <C-R>= to invoke a Vimscript function call, this example uses <C-R><C-R>= instead. The single-CTRL-R form inserts the result of the subsequent expression as if it had been directly typed, which means that any special characters within the result retain their special meanings and behavior. The double-CTRL-R form, on the other hand, inserts the result as verbatim text without any further processing.

Verbatim insertion is more appropriate in this example, since the aim is to exactly copy the text above the cursor. If the key-mapping used <C-R>=, copying a literal escape character from the previous line would be equivalent to typing it, and would cause the editor to instantly drop out of Insert mode.

Learning Vim's built-in functions

As you can see from each of the preceding examples, much of Vimscript's power comes from its extensive set of over 200 built-in functions. You can start learning about them by typing:

:help functions

or, to access a (more useful) categorized listing:

:help function-list

Looking ahead

Vimscript is a mechanism for reshaping and extending the Vim editor. Scripting lets you create new tools (such as a problem-word highlighter) and simplify common tasks (like changing tabspacing, or inserting time and date information, or toggling syntax highlighting), and even completely redesign existing editor features (for example, enhancing CTRL-Y's "copy-the-previous-line" behavior).

For many people, the easiest way to learn any new language is by example. To that end, you can find an endless supply of sample Vimscripts—most of which are also useful tools in their own right—on the Vim Tips wiki. Or, for more extensive examples of Vim scripting, you can trawl the 2000+ larger projects housed in the Vim script archive. Both are listed in the Resources section below.

If you're already familiar with Perl or Python or Ruby or PHP or Lua or Awk or Tcl or any shell language, then Vimscript will be both hauntingly familiar (in its general approach and concepts) and frustratingly different (in its particular syntactic idiosyncrasies). To overcome that cognitive dissonance and master Vimscript, you're going to have to spend some time experimenting, exploring, and playing with the language. To that end, why not take your biggest personal gripe about the way Vim currently works and see if you can script a better solution for yourself?

This article has described only Vimscript's basic variables, values, expressions, and functions. The range of "better solutions" you're likely to be able to construct with just those few components is, of course, extremely limited. So, in future installments, we'll look at more advanced Vimscript tools and techniques: data structures, flow control, user-defined commands, event-driven scripting, building Vim modules, and extending Vim using other scripting languages. In particular, the next article in this series will focus on the features of Vimscript's user-defined functions and on the many ways they can make your Vim experience better.

Resources

Learn

To learn more about the Vim editor and its many commands, see:
- The Vim homepage
- The online book A Byte of Vim
- Various hardcopy books on Vim
- Vim's own manual
- Steve Oualline's Vim Cookbook
For more extensive examples of Vimscripting, see:
- The Vim Tips wiki
- The Vim script archive

Get products and technologies

Start at the Vim distributions downloads page to upgrade to the latest version of Vim for your platform.

About the author

Damian Conway is an Adjunct Associate Professor of Computer Science at Monash University, Australia, and CEO of Thoughtstream, an international IT training company. He has been a daily vi user for well over a quarter of a century, and there now seems very little hope he will ever conquer the addiction.

Installing the x-unikey Vietnamese input method on Linux

(Reposted) The best-known Vietnamese input methods in the penguin world are xvnkb and x-unikey. xvnkb runs quite well in GNOME and XFCE, but in KDE it often causes an error that prevents logging in to X Window, and it also breaks CD automounting in Ubuntu. x-unikey, on the other hand, works very well in KDE and does not have xvnkb's CD automount problem, but it often misbehaves with OpenOffice in GNOME and XFCE. This article explains how to install x-unikey and how to work around the OpenOffice problem if you use GNOME or XFCE.

1. First, make sure your machine supports en_US.UTF-8 or vi_VN.UTF-8 by opening a terminal (or konsole) and typing locale -a. If the locale is missing, create it. Remember that this requires root privileges (prefix the commands with sudo on Ubuntu, or use su on other distros):
mkdir /usr/share/locale/en_US.UTF-8
localedef -v -ci en_US -f UTF-8 /usr/share/locale/en_US.UTF-8 (creates the en_US.UTF-8 locale)
or:
mkdir /usr/share/locale/vi_VN.UTF-8
localedef -v -ci vi_VN -f UTF-8 /usr/share/locale/vi_VN.UTF-8 (creates the vi_VN.UTF-8 locale)
2. Go to http://unikey.org/linux.php. There you can choose to download either the source code or the corresponding DEB or RPM package.

If you choose the DEB package, run dpkg -i filename.deb as root.
If you choose the RPM package, run rpm -i filename.rpm as root.
If you prefer to install from source code: unpack the archive, change into the unpacked directory, and run the commands:

./configure
make
make install (phải dùng quyền root)

3. Bây giờ bạn hãy vào thư mục ~ (thư mục /home/tên_của_bạn) mở file .bash_profile (chú ý đây là file ẩn) thêm vào các dòng sau:
export XMODIFIERS=”@im=unikey”
export GTK_IM_MODULE=”xim”
export LANG=en_US.UTF-8 (hoặc export LANG=vi_VN.UTF-
export LC_CTYPE=en_US.UTF-8 (hoặc export LC_CTYPE=vi_VN.UTF-8 ) Xong, bạn thử logout rồi login trở lại là gõ được tiếng việt
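
The edit in step 3 can also be scripted so that running it twice does not duplicate lines. The following is only a minimal sketch, not part of the original instructions: the helper names append_once and add_unikey_env are made up for illustration, and a POSIX shell with grep is assumed.

```shell
#!/bin/sh
# append_once FILE LINE: add LINE to FILE unless an identical line is already there.
# (append_once is a hypothetical helper name, not part of x-unikey.)
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF -e "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# add_unikey_env FILE: write the four settings from step 3 into FILE.
add_unikey_env() {
    append_once "$1" 'export XMODIFIERS="@im=unikey"'
    append_once "$1" 'export GTK_IM_MODULE="xim"'
    append_once "$1" 'export LANG=en_US.UTF-8'
    append_once "$1" 'export LC_CTYPE=en_US.UTF-8'
}
```

Calling add_unikey_env "$HOME/.bash_profile" would apply the settings to the real profile; trying it on a scratch file first is the safer way to see what it writes.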
Fixing Vietnamese input in OpenOffice on GNOME and XFCE

If typing Vietnamese diacritics in OpenOffice produces only digits, fix it as follows:
Open the file options in the ~/.unikey directory, find the line CommitMethod = Send, and change it to CommitMethod = Forward. Log in again and you will be able to type Vietnamese in OpenOffice. Forward does not handle Vietnamese input as well as Send, however, so try it and see; if you don't like it, leave the setting at Send and switch to Forward only when you need to type in OpenOffice.
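
Since the advice is to flip between Send and Forward as needed, a small helper can save opening the file by hand each time. This is a sketch under assumptions: set_commit_method is a made-up name, and the options file is assumed to contain a line of the form "CommitMethod = value" as quoted above.

```shell
#!/bin/sh
# set_commit_method FILE METHOD: rewrite the CommitMethod line in a unikey
# options file. METHOD must be Send or Forward, the two values the text
# mentions. (set_commit_method is a hypothetical helper name.)
set_commit_method() {
    file=$1
    method=$2
    case $method in
        Send|Forward) ;;
        *) echo "unknown method: $method" >&2; return 1 ;;
    esac
    # Write to a temporary file first so a failed sed cannot truncate FILE.
    sed "s/^\([[:space:]]*CommitMethod[[:space:]]*=[[:space:]]*\).*/\1$method/" \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

For example, set_commit_method "$HOME/.unikey/options" Forward before an OpenOffice session, and the same call with Send afterwards. As the text says, the change takes effect after logging in again.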

————————————————————————————–

(Reposted) Recently, Ubuntu devotees in Vietnam have complained quite a lot about typing Vietnamese on the Hardy Heron release. The problem is that the xvnkb input method does not work once installed. Here I'd like to share some of my experience with this problem.
To type Vietnamese on Ubuntu Hardy, I tried two approaches: the SCIM input method and the XVNKB input method.

First, enable Vietnamese input support by installing the following packages:

sudo apt-get install language-pack-vi language-support-input-vi

Alternatively, go to the menu System > Administration > Language Support and install whatever it asks you to install. When that finishes, a list of supported languages appears. Scroll down near the bottom to find Vietnamese and check it. At the bottom of the dialog, check Enable support to enter complex characters. Click Apply so that the program completes the setup, then restart the machine.
After you log in, a small keyboard icon appears in the tray icon area at the top right. That is the SCIM icon. Left-click it and choose a Vietnamese input mode (telex, vni ...) and you can type Vietnamese.
To type Vietnamese with xvnkb, you must also go through the steps above. Then install xvnkb (a bug-fixed build is available for download).
First install the ld.so.preload-manager package:

sudo apt-get install ld.so.preload-manager

Next, install the xvnkb package:

sudo dpkg -i xvnkb_0.3-1ubuntu710_i386.deb

After installation, start xvnkb by running the command xvnkb. For xvnkb to take effect, however, you must turn SCIM off by right-clicking the SCIM icon and choosing Exit. When both input methods are installed together like this, SCIM takes priority over xvnkb at startup and in use.
We can now type Vietnamese with both SCIM and xvnkb. The remaining problem, which I have not solved, is how to turn SCIM off so that it does not start automatically while still being able to type with xvnkb (this is only a matter of preference; I find that SCIM handles Vietnamese well too). I hope everyone will help find a solution.

Sunday, May 24, 2009

Creating Filesystems Using TimeStorm LDS

Introduction

One of the most significant aspects of the power and flexibility of Linux is the fact that it is a true multi-tasking operating system, not merely a complex program loader linked with a single application. This enables embedded Linux applications to display greater flexibility and responsiveness to a variety of conditions than the simpler execution environments provided by traditional, proprietary embedded operating systems.

This white paper begins with overviews of the filesystems used during the boot process on most embedded Linux systems, the Linux system startup process, and the relationships between the two. Subsequent sections discuss system requirements for successful system startup and program execution, focusing on how powerful software such as TimeSys Linux Development Suite (LDS) delivers Linux expertise in an easy-to-use, graphical environment that can reduce the time and effort required to create and manage a deployable, right-sized embedded Linux system containing customized applications.

Filesystems and the Linux Startup Process

The core of the Linux operating system is known as the kernel. When an embedded Linux system boots, the kernel is loaded into memory from a device that the system's boot monitor can access, and then executed. The kernel automatically probes, identifies, and initializes as much of your system's hardware as possible, and then looks for an initial filesystem that it can access, and from which it can load and run applications, in order to continue the boot process. The first filesystem mounted by Linux systems during the boot process is known as a root filesystem because it is automatically mounted at the Linux directory '/', which is the base of the hierarchical Linux filesystem. Once mounted, the root filesystem provides the Linux system with a basic directory structure that it can use to map devices to Linux device nodes, access those devices, and locate, load, and execute subsequent code such as system code or your custom applications.

Linux supports a wider range of filesystems than any other operating system. In the context of the Linux boot process, these fall into three general classes:

Filesystems that are located on local devices which are directly connected to your embedded hardware

Network filesystems using standard protocols such as NFS

A special in-memory filesystem known as an initial RAM disk

The types of root filesystems that your embedded Linux system supports during the boot process depend on the types of filesystems that are supported by the kernel that you are booting.

Initial RAM Disks and Linux

Initial RAM disks are actually compressed images of other types of Linux filesystems. An initial RAM disk provides a filesystem from which the kernel can execute programs and load kernel modules as part of the boot process. The former enables you to perform administrative tasks such as device initialization, checking the consistency of any local storage devices, and so on before actually using or mounting those devices. The latter enables you to minimize the number of device drivers that you have to build into your kernel, because device drivers can be loaded as modules based on the type of hardware that is detected by the kernel's device probing and initialization routines. Initial RAM disks are therefore commonly used on desktop Linux systems which may be deployed on a wide range of different hardware. This aspect of an initial RAM disk is less important in embedded Linux systems, because the types of attached hardware rarely change.

Many Linux distributions, such as the Linux 2.6 Reference Distributions available from TimeSys, include pre-assembled initial RAM disks for supported platforms and architectures. Desktop Linux systems typically use the EXT2 filesystem in their initial RAM disks, but many embedded Linux systems use smaller, simpler types of filesystems such as CRAMFS, ROMFS, or even the Minix filesystem. Regardless of the type of filesystem it contains, an initial RAM disk is typically compressed using gzip to save even more space.
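
Because the image is usually gzip-compressed, a quick way to tell whether a given initial RAM disk file is compressed is to look at its first two bytes, which are 0x1f 0x8b for any gzip stream. The sketch below assumes a POSIX shell with od; is_gzip is a name chosen here for illustration.

```shell
#!/bin/sh
# is_gzip FILE: succeed if FILE starts with the gzip magic bytes 0x1f 0x8b.
# (is_gzip is a hypothetical helper name.)
is_gzip() {
    # Dump the first two bytes as hex and strip the spacing od adds.
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}
```

A typical use would be: if is_gzip initrd.img; then gzip -dc initrd.img > initrd.raw; fi, where initrd.img stands for whatever image file you are inspecting.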

Initial RAM disks are extremely popular in embedded Linux deployments for two reasons. First, they are effectively mandatory in embedded systems that do not have attached storage devices or which cannot depend on network access during the boot process. Second, they provide an easy way to load proprietary kernel modules that would otherwise have to be built into the kernel and would therefore be subject to the GNU General Public License (GPL).

Using an Initial RAM Disk

In order to boot from an initial RAM disk, you must configure your kernel to support the type of filesystem in which your initial RAM disk was originally created, and you must also activate kernel configuration variables such as CONFIG_BLK_DEV_RAM and CONFIG_BLK_DEV_INITRD. The kernel knows how to uncompress the initial RAM disk image into memory, which it can then mount and access like any physical filesystem.
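
Before building, it can be worth confirming that both options are actually set in the kernel configuration. The following is a minimal sketch, assuming the usual "OPTION=y" line format of a kernel .config file; check_initrd_config is a made-up helper name, and it deliberately treats anything other than =y as missing.

```shell
#!/bin/sh
# check_initrd_config FILE: report whether the two options named in the text
# are built in (=y) in a kernel configuration file such as .config.
# Returns 0 only if both are enabled. (Hypothetical helper name.)
check_initrd_config() {
    config=$1
    status=0
    for opt in CONFIG_BLK_DEV_RAM CONFIG_BLK_DEV_INITRD; do
        if grep -q "^$opt=y\$" "$config"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
            status=1
        fi
    done
    return $status
}
```

Run it as check_initrd_config .config from the top of the kernel source tree. It does not check the filesystem driver for the RAM disk's own filesystem type, which the text also requires.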

In embedded Linux scenarios, an initial RAM disk is often bundled into the Linux kernel during the kernel compilation process. This provides a single bootable entity that you can install into Flash or any other boot media that your embedded system's boot monitor can access. However, this also means that you must either prepare the initial RAM disk before you begin the kernel compilation process or that you must build it during the kernel compilation process.

Using an initial RAM disk has two potential problems. First, embedded systems with extremely limited amounts of RAM may not have enough to spare to hold the filesystem image. Second, an initial RAM disk provides no long-term storage: each time that you boot your embedded Linux system, the initial RAM disk is reloaded from the compressed image, which cannot be updated without recompilation. For these reasons, embedded systems with local storage typically install a root filesystem there. This takes full advantage of your available hardware and, in the case of writable storage media such as Flash, Compact Flash, Disk-On-Chip, or devices such as hard drives, provides a location where your system can maintain state and log information across reboots.

Using Other Types of Root Filesystems with Embedded Linux

Root filesystems in formats such as the Journaling Flash Filesystem (JFFS2) are typically used on systems with Flash memory that can be partitioned into multiple sections, usually containing the boot monitor, the loadable kernel image, and a JFFS2 filesystem. Systems with attached devices such as Compact Flash or hard drives typically use a journaling filesystem such as EXT3, XFS, or JFS. Journaling filesystems reduce system restart time by minimizing the chances that a filesystem will be left in an inconsistent state by a system crash or unplanned restart.

Finally, network filesystems are often used as root filesystems during the embedded Linux development process, because they provide more storage than is typically available on an embedded system, and also because they can easily preserve debugging and other program and system state information across restarts of your embedded system, since the storage is not physically located on your embedded system.

Regardless of the type of root filesystem that you want to use in your embedded Linux system, manually creating a root filesystem using traditional methods requires some specialized Linux knowledge. You have to be familiar with the commands used to create filesystems of each type. If you are creating an initial RAM disk, you have to create an empty file, associate that file with a Linux device, create the filesystem, and then mount that file as a special type of virtual device known as a loopback device in order to populate it. Depending on how your embedded system accesses devices, you may also have to know the specialized Linux commands required to create device nodes and their naming conventions. In order to create a root filesystem of any type, you have to understand the organization of a Linux root filesystem.
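
To make the manual procedure concrete, here is a rough sketch of the initial RAM disk steps in shell. It is an illustration under assumptions, not the white paper's procedure: the function names are invented, EXT2 is picked only as an example filesystem type, and "mount -o loop" is used so that mount attaches the file to a loopback device itself. Only the first function can run without root.

```shell
#!/bin/sh
# create_empty_image FILE SIZE_KB: create a zero-filled file of SIZE_KB
# kilobytes that can later hold a filesystem. Needs no special privileges.
create_empty_image() {
    dd if=/dev/zero of="$1" bs=1024 count="$2" 2>/dev/null
}

# populate_image FILE MOUNTPOINT SOURCE_DIR: make an EXT2 filesystem in FILE,
# loop-mount it, and copy SOURCE_DIR into it. Must be run as root on Linux.
# EXT2 is only an example; substitute the mkfs tool for your filesystem type.
populate_image() {
    # -F lets mke2fs operate on a regular file instead of a block device.
    mke2fs -q -F "$1" || return 1
    mkdir -p "$2" || return 1
    # The loop option asks mount to attach the file to a loopback device.
    mount -o loop "$1" "$2" || return 1
    cp -a "$3"/. "$2"/
    umount "$2"
}
```

Something like create_empty_image initrd.img 4096 followed by populate_image initrd.img /mnt/initrd ./rootfs (all three names are placeholders) covers the create, associate, mkfs, and mount steps listed above. Device nodes and compression are left out.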

As discussed in the remainder of this white paper, graphical embedded development tools, such as TimeStorm Integrated Development Environment (IDE) and TimeStorm Linux Development Suite (LDS), can expedite and simplify development by encapsulating much of the specialized Linux knowledge that you would otherwise have to master and memorize. Figure 1 shows the TimeStorm LDS RFS Image Options screen, on which you can specify the type of filesystem that you want to create.

Figure 1: Specifying Filesystem Type Information in TimeStorm LDS
Using TimeStorm LDS to create a root filesystem can be as simple as specifying its type, selecting the packages that you want to include, defining any special behavior you're interested in, and clicking a button.

Overview of the Linux Startup Process

Though TimeStorm LDS makes it easy to assemble a root filesystem that is suitable for running your Linux system and application(s), it's important to understand the applications that your system may execute during the boot process, in order to make sure that they are present in your root filesystem. As we'll see later, TimeStorm LDS makes it easy to identify missing components of a root filesystem. This section discusses the files used during the Linux boot process and the infrastructure required for running many Linux applications, in order to provide some background for later sections.

All Linux systems start in essentially the same way, with one minor but significant difference if you are using an initial RAM disk as part of the boot process. After loading the kernel into memory and executing it, Linux systems execute a system application known as the init (initialization) process, which is typically found in /sbin/init on Linux systems. The init process is process number 1 on the system, as shown in a process status listing produced using the "ps" command, and is therefore the ancestor of all other processes on your system. The init process reads the file /etc/inittab, which identifies the way in which the system should boot and lists all other processes and programs that init should start.
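
As a small illustration of what init reads from /etc/inittab, the sketch below pulls out the default runlevel. It assumes the classic SysV init format, in which each entry has the form id:runlevels:action:process and the entry whose action is initdefault names the runlevel to boot into; other init implementations (BusyBox init, for example) use the file differently. default_runlevel is a name invented for this example.

```shell
#!/bin/sh
# default_runlevel FILE: print the runlevel from the initdefault entry of a
# SysV-style inittab, whose lines have the form id:runlevels:action:process.
# (default_runlevel is a hypothetical helper name.)
default_runlevel() {
    # Skip comment lines, split on ':', and stop at the first initdefault entry.
    awk -F: '!/^#/ && $3 == "initdefault" { print $2; exit }' "$1"
}
```

On a system with such a file, default_runlevel /etc/inittab prints a single digit; it prints nothing if no initdefault entry exists.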

When you boot a Linux system that uses an initial RAM disk, the boot sequence includes one extra step. Before executing the init process, the system uncompresses and mounts the initial RAM disk, and then executes the file /linuxrc (Linux Run Commands). This file must therefore be executable, but can be a command file that lists other commands to execute, can be a multi-call binary such as BusyBox, or can simply be a symbolic link to a multi-call binary or to the /sbin/init process itself.

The /linuxrc File in an Initial RAM Disk

The kernel executes the file /linuxrc as a step in the initial RAM disk's mount process, as specified in the kernel source file init/do_mounts_initrd.c. The following sample /linuxrc file, taken from a generic Red Hat 9 system, is a command script:

#!/bin/nash

echo Mounting /proc filesystem
mount -t proc /proc /proc
echo Creating block devices
mkdevices /dev
echo Creating root device
mkrootdev /dev/root
echo 0x0100 > /proc/sys/kernel/real-root-dev
echo Mounting root filesystem
mount -o defaults --ro -t ext3 /dev/root /sysroot
pivot_root /sysroot /sysroot/initrd
umount /initrd/proc

As you can see from this example, this sample /linuxrc file executes a number of commands that help initialize the system. The last commands in this command file mount the root filesystem on a local device and use the pivot_root command to change the system's idea of the root ('/') directory. Systems that have local storage and want to use a filesystem on it as the root filesystem, but which also use an initial RAM disk, use the pivot_root command, included in the util-linux package, to change the system's root directory from the initial RAM disk to the device that actually provides long-term storage. This device is usually identified through a kernel boot argument.

On embedded systems that use an initial RAM disk and where you do not need to load any additional modules, perform any additional commands during the boot process, and so on, the /linuxrc file is often a symbolic link to the /sbin/init program discussed earlier in this section. This is an optimization: if the /linuxrc file is an actual command file, your Linux system typically executes the /sbin/init process when it finishes executing the /linuxrc command file.
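
When inspecting an unpacked initial RAM disk, it helps to know which of these forms its /linuxrc takes. The following sketch distinguishes a symbolic link from a command script (recognized by a leading #! line) and treats everything else as a binary. The function name is invented, and the classification is a simplification: it does not check the executable bit or identify multi-call binaries such as BusyBox.

```shell
#!/bin/sh
# describe_linuxrc PATH: say which of the forms described in the text a
# /linuxrc candidate takes. (Hypothetical helper; PATH would normally point
# into an unpacked root filesystem, not the running system.)
describe_linuxrc() {
    f=$1
    if [ -L "$f" ]; then
        echo "symlink to $(readlink "$f")"
    elif [ "$(head -c 2 "$f")" = "#!" ]; then
        echo "command script"
    else
        echo "binary or other"
    fi
}
```

For instance, describe_linuxrc ./rootfs/linuxrc, where ./rootfs is a placeholder for the directory into which the image was unpacked.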

Components of a Root Filesystem

Regardless of whether your root filesystem is an initial RAM disk, local storage, or a network root filesystem, that filesystem must provide any kernel modules that you want to load during the boot process, all of the Linux commands the system needs to execute, any custom applications that you want to run on your Linux system, and entries for all of the devices that you want to be able to access from your system. It also needs to provide any Linux infrastructure that the system needs in order to execute those applications.

Inter-Package Dependencies

Linux applications are generally provided as part of packages that contain a group of related applications. For example, the module-init-tools package on a 2.6 Linux system (known as modutils on 2.4 and earlier Linux systems) contains the applications that Linux systems use to insert, remove, and query loadable kernel modules, such as /sbin/depmod, /sbin/rmmod, /sbin/insmod, and so on. When you construct a root filesystem, these applications must be present in your root filesystem in order to perform any of those functions. Similarly, to log in on a Linux system over the network, that system must be running a server process such as the telnet or SSH daemons in order to receive your remote request. These servers are typically found in the telnet-server and openssh-server packages, respectively. It must also be able to initiate a login process of some sort (such as getty, mingetty, or agetty), and then execute your login shell (usually bash) and other commands that the login process requires.

To make mandatory packages available, most Linux distributions, such as those available from TimeSys, include sets of packages that are pre-compiled for your target hardware and from which you can pick and choose when creating a root filesystem. Figure 2 shows the TimeStorm LDS Package Selection screen, which enables you to select from the packages available for installation in your root filesystem. This list includes any RPM (Red Hat Package Manager) packages or TimeStorm projects that you have created in any of your TimeStorm workspaces. The list shown in the figure is for a customized RFS based on a subset of available packages.

Figure 2: TimeStorm LDS Package Selection Screen
Unfortunately, the relationships between the applications contained in different packages are not always easy to identify unless you are a Linux wizard, or have access to a tool that automates this sort of relationship analysis, such as TimeStorm LDS.

Library Dependencies

Even after mastering packages and their relationships, an additional source of complexity in many embedded Linux systems is that Linux applications are often compiled using shared libraries. One of the advantages of Linux is that it provides a rich program compilation and execution environment. Thousands of libraries of pre-compiled functions are available, which your applications can use to minimize the number of times that you have to reinvent the wheel. Linux supports two different types of libraries, known as static and shared libraries.

Static libraries are libraries of precompiled functions that your application links to when it is compiled, resolving any function references at that time. Your application's binary therefore literally contains a copy of each library that it is statically linked to, but can be executed without those libraries actually being installed on your system, because the libraries are actually compiled into your applications. This leads to larger, but more self-sufficient, binaries.

Shared libraries are libraries of commonly-used functions that applications can link to when you execute those applications (i.e., at run-time), rather than when they are compiled. Binaries that use shared libraries are therefore much smaller than statically-linked binaries, but the libraries that they require must be present on your system in order to execute them. A run-time loader and the library that it requires must also be present on your system in order to run those binaries.

Understanding the hierarchy of commands that standard Linux services require, the relationships between the commands in different Linux packages, and the shared libraries that these different commands can require is known as "dependency analysis." Dependency analysis is important on any Linux system in order to ensure that you will be able to run the services and commands that you need to execute. It is even more important in embedded Linux systems, where resources are limited and you may therefore need to make your root filesystem as small as possible without sacrificing functionality.

Linux provides a number of specialized command-line tools for understanding package relationships and performing dependency analysis, each of which uses different syntax and requires special expertise. An alternative to becoming a Linux wizard at this level is to use a graphical tool such as TimeStorm LDS, which encapsulates all of the Linux expertise you'll need to perform this sort of analysis, and does it for you when you simply select a menu command.
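
As one example of the command-line route, the ldd tool (my choice here; the white paper does not name specific tools) lists the shared libraries a dynamically linked binary needs. The sketch below filters ldd-style output down to the library files that would have to be copied into a root filesystem. It assumes the common output format, in which lines look like "name => /path (address)" or "/path (address)"; resolved_libs is an invented name.

```shell
#!/bin/sh
# resolved_libs: read ldd-style output on standard input and print the path
# of every shared library that was resolved to a file, one per line.
# Lines without a path (virtual libraries, "not found") are skipped.
# (resolved_libs is a hypothetical helper name.)
resolved_libs() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }
         $1 ~ /^\// { print $1 }' | LC_ALL=C sort -u
}
```

Typical use would be ldd /sbin/insmod | resolved_libs, repeated for each binary you plan to install. Note that this covers only library dependencies, not the package-level relationships discussed earlier.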

Figure 3 shows the TimeStorm LDS Library Dependencies screen, displaying information about the packages and files provided and required by the e2fsprogs package, a package of utilities for creating and maintaining EXT2 and EXT3 filesystems.

Figure 3: TimeStorm LDS Library Dependencies Screen

Standards for Filesystem Content and Organization

Many embedded Linux systems are designed in-house, or for specialized purposes such as supporting a single, dedicated application or set of related applications. These custom, single-purpose systems tend to use root filesystems that include only the applications, utilities, and infrastructure that they require in order to boot and run.

As Linux systems become more widely used, especially within industry segments such as telecommunications, many software vendors are beginning to develop applications that are designed to run on a wide range of Linux-based systems. These systems may or may not even use the same Linux distribution, which makes it far more difficult to prepare for and address differences in the packages and infrastructure that are available on these systems.

The Linux community has proposed various standards that are designed to address and eliminate these sorts of problems. The best-known of these are the FHS (Filesystem Hierarchy Standard), LSB (Linux Standard Base), and CGL (Carrier Grade Linux) standards, which do the following:

FHS defines a standard set of utilities and associated infrastructure and where those utilities and associated files should be located in a compliant Linux filesystem. Being able to ensure that FHS-compliant filesystems will contain specific applications in a known location makes it easy for Linux command scripts and applications to leverage and invoke other Linux applications, which is the essence of the Unix application model in the first place.

LSB mandates FHS, adding the notion of a compliant run-time environment that will enable pre-compiled binaries to run on any other LSB-compliant system.

CGL mandates LSB, adding a number of performance and functionality requirements at both the kernel and filesystem level. The CGL specification is designed to make sure that compliant systems satisfy the needs of the telecommunications industry, but much of it is applicable to any high-availability environment.

Standards such as CGL are actively under development themselves. For example, two versions of the CGL specification are currently available. The older CGL 1.1 standard is supported by several embedded Linux distributions, such as those from MontaVista. More modern Linux distributions, such as the 2.6-based CGL Reference Distribution available from TimeSys, target the more extensive and up-to-date CGL 2.0 specification.

Summary

The Linux boot process, its interactions with physical and in-memory filesystems, the sequence of command scripts executed as a Linux system boots, and the dependencies between commands, files, and libraries can be quite complex. Graphical tools such as the TimeStorm Target Configurator from TimeSys, a component of TimeStorm LDS, make it easy to create initial RAM disks and other types of filesystems that contain the system software that you need at boot time or run time. TimeStorm LDS also simplifies adding your applications to the root filesystem that you want to deploy, whether this will be an initial RAM disk, located on flash storage, or stored on physical media such as a disk drive. The TimeStorm tools are powerful, easy-to-use, graphical tools that build in much of the specialized knowledge traditionally required when working with Linux systems, letting you focus on your application and implementation rather than having to become a Linux wizard along the way.
About the author

William von Hagen is a Senior Product Manager at TimeSys Corp., has been a Unix devotee for over twenty years, and has been a Linux fanatic since the early 1990s. He has worked as a system administrator, writer, developer, systems programmer, drummer, and product and content manager. Bill is the author of Linux Filesystems, Hacking the TiVo, SGML for Dummies, Installing Red Hat Linux 7, and is the coauthor of The Definitive Guide to GCC (with Kurt Wall) and The Mac OS X Power Users Guide (with Brian Profitt). Linux Filesystems is available in English, Spanish, Polish, and traditional Chinese. Bill has also written for publications including Linux Magazine, Mac Tech, Linux Format, and online sites such as Linux Planet and Linux Today. An avid computer collector specializing in workstations, he owns more than 200 computer systems.

Friday, May 22, 2009

how to set up a VPN with Openswan combined with L2TPD

This document describes how to set up a VPN with Openswan combined with L2TPD. This provides for a more user-friendly experience than a standard IPSec VPN on many client operating systems. Note that for most site<->site VPNs, you will still want straight IPSec.

If you're not sure if IPSec is right for you, I have written a quick document about some of the various types of VPN available under Linux. It is available at: http://www.natecarlson.com/linux/linux-vpn.php. I hope this helps clear up some questions.

This page is heavily based on my basic IPSec configuration page, located at http://www.natecarlson.com/linux/ipsec-x509.php. The l2tpd configuration side is based on Jacco de Leeuw's page, which is the definitive source for anything related to Openswan and L2TP. I'm just trying to simplify things for the average Linux geek -- if you need more detailed information, or information about any clients other than Windows, check out his page. If you have any difficulties with this process, please e-mail the Openswan mailing list, or if you can't get help from there, e-mail me at: ipsec@natecarlson.com.

All of my examples on this page are based on a Debian Sarge system, since all the packages required are readily available. Most examples are readily portable to other distributions; you will just need to get the required software for that distribution.

NOTE: I do occasionally post notes about new VPN options and such on my blog; see the VPN category at: http://www.natecarlson.com/blog/category/geek-stuff/vpn. Also, if you are interested in consulting services to help you set things up, I am available on a very limited basis - please see my consulting page.

Contents:
Changes made to this document
Setting up a Certificate Authority
Generating a Certificate
Installing Openswan
Installing the Certificate on your Gateway
Configuring Openswan on the Gateway Machine
Configuring l2tpd on the Gateway Machine
Client Setup: Windows XP
Client Setup: Real IPSec Clients
Some common errors, and resolutions for them
References used to write this document
Changes made to this document
$Id: ipsec-l2tp.php,v 1.11 2006/07/10 15:19:36 natecars Exp $
[03/18/05] Added directions to disable Opportunistic Encryption.
[01/18/05] Initial revision; based on my IPSec X.509 page.

Setting up your Certificate Authority
I'm assuming you want to use X.509 certificates for authentication. It may be possible to get this working with pre-shared keys, but I haven't tried it. I am also assuming that you will need your own Certificate Authority dedicated to VPN usage - if you already have access to a CA, you may just want to generate certificates from there (if that's the case, you can just skim this section). If you need more details than I am going into here, please read the OpenSSL documentation -- it's fairly detailed. For CA certificate management, my examples use the utilities included with OpenSSL itself - there are third-party tools out there that make this a bit simpler, but I want to keep dependencies low. Note that you do not necessarily need to use your Openswan gateway as the Certificate Authority - it can be any box with OpenSSL installed. In fact, it may be better to use a different box, so if an attacker gains access to your Openswan gateway they don't have access to your CA, too. If you have any suggestions on how to make this process simpler, please let me know!

Now, on to the good stuff - let's start setting up our own CA.

1) Install openssl. On Debian, 'apt-get install openssl' will take care of this.
2) Find your openssl.cnf file. This file has default values for OpenSSL certificate generation. Here are a few locations for various distributions:

Debian: /etc/ssl/openssl.cnf
RedHat 7.x+: /usr/share/ssl/openssl.cnf

Open this file in your favorite editor. We will need to change the following options:

'default_days': This is the length of time, in days, that your certificates will be valid for, and defaults to 365 days, or 1 year. I recommend setting this to '3650', as that will give you 10 years of validity on your certificates. Since this is for internal use, I am ok with the security ramifications of having a certificate valid for a long time - if you lose it or whatnot, you can revoke it without a problem.

'[ req_distinguished_name ]' section: You don't really *need* to change the options below req_distinguished_name; they just set the default options (such as location, company name, etc) for certificate generation. I find it's easier to set them here than re-type them for every certificate.

3) Create a directory to house your CA. I generally use something like /var/sslca; you can really use whatever you want. Change the permissions of the directory to 700, so that people who aren't supposed to access the private keys can't.

4) Find the command 'CA.sh' (some distributions rename it to just 'CA'; don't ask me why.) Locations on various distributions:

Debian: /usr/lib/ssl/misc/CA.sh
RedHat 7.x+: /usr/share/ssl/misc/CA

Edit this file, and change the line that says 'DAYS="-days 365"' so that it uses a much higher number (this sets how long the certificate authority's certificate is valid). Be sure that this number is higher than the default_days value you set in step 2, or else Windows may not accept your certificates. Note that if this number is too high, it can cause problems - I generally set it for 15-20 years.

5) Run the command 'CA.sh -newca'. Follow the prompts, as below. Example input follows each prompt, with (enter) marking the Enter key, and my comments come at the end of the line. Be sure not to use any non-alphanumeric characters, such as dashes, commas, plus signs, etc. These characters may make things more difficult for you.

nate@example:~/sslca$ /usr/lib/ssl/misc/CA.sh -newca
CA certificate filename (or enter to create)
(enter)
Making CA certificate ...
Using configuration from /usr/lib/ssl/openssl.cnf
Generating a 1024 bit RSA private key
.............................................................................+++
........................................+++
writing new private key to './demoCA/private/./cakey.pem'
Enter PEM pass phrase:(enter password) This is the password you will need to create any other certificates.
Verifying password - Enter PEM pass phrase:(repeat password)
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US(enter) Enter your country code here
State or Province Name (full name) [Some-State]:State(enter) Enter your state/province here
Locality Name (eg, city) []:City(enter) Enter your city here
Organization Name (eg, company) [Internet Widgits Pty Ltd]:ExampleCo(enter) Enter your company name here (or leave blank)
Organizational Unit Name (eg, section) []:(enter) OU, if you like. I usually leave it blank.
Common Name (eg, YOUR name) []:CA(enter) The name of your Certificate Authority
Email Address []:ca@example.com(enter) E-Mail Address
nate@example:~/sslca$

Let's also generate a CRL (certificate revocation list) file, which you'll need on your gateway boxes:
nate@example:~/sslca$ openssl ca -gencrl -out crl.pem
You'll need to update this CRL file any time you revoke a certificate.
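
A revocation followed by a CRL refresh might look like the sketch below. It assumes you run it from the CA directory with the same default openssl.cnf as above; revoke_and_refresh_crl and run are hypothetical helper names, and DRY_RUN is an illustrative switch that prints the openssl commands instead of executing them.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: revoke one certificate, then regenerate crl.pem.
revoke_and_refresh_crl() {
    # $1 = certificate to revoke, e.g. host.example.com.pem
    run openssl ca -revoke "$1" &&
    run openssl ca -gencrl -out crl.pem
}

# Preview the commands without touching the CA:
#   DRY_RUN=1 revoke_and_refresh_crl host.example.com.pem
```

The regenerated crl.pem then has to be copied to each gateway again.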

That's it, you now have your own certificate authority that you can use to generate certificates.

Generating a Certificate
You will need to generate a certificate for every machine that will be making an IPSec connection. This includes the gateway host, and each of your client machines. This section details how to create the certificate, and convert it to formats needed for Windows and such.

Again, we'll be using the CA.sh script, but this time, instead of telling it to create a new Certificate Authority, we're telling it to generate a certificate request and then sign it:

nate@example:~/sslca$ /usr/lib/ssl/misc/CA.sh -newreq
Using configuration from /usr/lib/ssl/openssl.cnf
Generating a 1024 bit RSA private key
...................................+++
...............................+++
writing new private key to 'newreq.pem'
Enter PEM pass phrase:(enter password) Password to encrypt the new cert's private key with - you'll need this!
Verifying password - Enter PEM pass phrase:(repeat password)
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US(enter)
State or Province Name (full name) [Some-State]:State(enter)
Locality Name (eg, city) []:City(enter)
Organization Name (eg, company) [Internet Widgits Pty Ltd]:ExampleCo(enter)
Organizational Unit Name (eg, section) []:(enter)
Common Name (eg, YOUR name) []:host.example.com(enter) This can be a hostname, a real name, an e-mail address, or whatever
Email Address []:user@example.com(enter) (optional)

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:(enter)
An optional company name []:(enter)
Request (and private key) is in newreq.pem

What we just did is generate a Certificate Request - this is the same type of request that you would send to Thawte or Verisign to get a generally-accepted SSL certificate. For our uses, however, we'll sign it with our own CA:

nate@example:~/sslca$ /usr/lib/ssl/misc/CA.sh -sign
Using configuration from /usr/lib/ssl/openssl.cnf
Enter PEM pass phrase:(password you entered when creating the ca)
Check that the request matches the signature
Signature ok
The Subjects Distinguished Name is as follows
countryName :PRINTABLE:'US'
stateOrProvinceName :PRINTABLE:'State'
localityName :PRINTABLE:'City'
organizationName :PRINTABLE:'ExampleCo'
commonName :PRINTABLE:'host.example.com'
emailAddress :IA5STRING:'user@example.com'
Certificate is to be certified until Feb 13 16:28:40 2012 GMT (3650 days)
Sign the certificate? [y/n]:y(enter)

1 out of 1 certificate requests certified, commit? [y/n]y(enter)
Write out database with 1 new entries
Data Base Updated
(certificate snipped)
Signed certificate is in newcert.pem

Next, move the output files to names that make a bit more sense for future reference.

nate@example:~/sslca$ mv newcert.pem host.example.com.pem
nate@example:~/sslca$ mv newreq.pem host.example.com.key

That's all that's required for Openswan boxes - you'll need these two files, along with the file 'cacert.pem' from the 'demoCA' directory, and the 'crl.pem' file you generated earlier.
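If this certificate is needed for a Windows box, you'll need to convert it to PKCS#12 (.p12) format. The example below assumes the certificate and key were named winhost.example.com.pem and winhost.example.com.key: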
$ openssl pkcs12 -export -in winhost.example.com.pem -inkey winhost.example.com.key -certfile demoCA/cacert.pem -out winhost.example.com.p12

Installing Openswan
You'll need to install Openswan on each Linux box that you want to speak IPSec. This section covers installing the actual software.

If you are running Debian, there are binary packages available in Sarge and above. For RedHat or Fedora, ATrpms provides binary packages. I can't vouch for the quality of these packages, but I do know many people have used them with good success. See http://atrpms.net. If you want to build it from scratch, you can download it from http://www.openswan.org/code, and follow the installation directions included with the package. I recommend the most recent version in the 2.2 series, until 2.3.1 is available - 2.3.0 has some critical bugs.

You now have two options for which IPSec stack you want to install in the kernel - you can either use Openswan's IPSec stack (KLIPS), or use the built-in IPSec stack in the 2.6 kernel (26sec). If you are running on a stock 2.4 kernel, the only option is KLIPS. You'll need to patch NAT Traversal support into your kernel (if you intend to use it), and build the ipsec.o kernel module. Otherwise, if you are using a 2.6 kernel or a 2.4 kernel with backported 26sec support (such as the kernel Debian provides), you don't need to touch the kernel-land at all - you can just install the Openswan user-land utilities and go. With Openswan 2.3.1, we will also have support for KLIPS on 2.6, but without NAT Traversal support (until someone gets around to fixing it!) My current recommendation (and my only tested configuration) is to use a stock kernel, patched with NAT Traversal and with KLIPS added. If you bug me, I'll probably provide patched up Debian packages. :) I have heard stories about l2tpd not working with the kernel stack.

Once you've selected and set up your IPSec stack and installed the user-land programs, you're ready to move on to configuring Openswan.

Installing the Certificate on your Gateway
This section discusses how to install the certificate on your gateway machine. The same steps apply for installing the cert on Openswan clients, too. I'm assuming you've already created a certificate for each machine (see the "Generating a Certificate" section); if that's not the case, please go back and do that now.

1) Install the files in their proper locations (if installing to a remote machine, please be sure to copy the files in a secure manner):

$ cp /var/sslca/host.example.com.key /etc/ipsec.d/private
$ cp /var/sslca/host.example.com.pem /etc/ipsec.d/certs
$ cp /var/sslca/demoCA/cacert.pem /etc/ipsec.d/cacerts
$ cp /var/sslca/crl.pem /etc/ipsec.d/crls/crl.pem
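
To double-check that all four files landed where the commands above put them, something like the following sketch could be used. check_ipsec_files is a hypothetical helper; it only tests that the files exist, not that their contents are valid.

```shell
# Hypothetical helper: report which of the expected files are missing.
# $1 = base directory (normally /etc/ipsec.d)
# $2 = host name used in the file names, e.g. host.example.com
check_ipsec_files() {
    missing=0
    for f in "private/$2.key" "certs/$2.pem" "cacerts/cacert.pem" "crls/crl.pem"; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f"
            missing=1
        fi
    done
    return "$missing"
}

# Example:
#   check_ipsec_files /etc/ipsec.d host.example.com
```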

Configuring Openswan on the Gateway Machine
1) Configure ipsec.secrets:
/etc/ipsec.secrets should contain the following:

: RSA host.example.com.key "password"

The password above should be the pass phrase you entered when generating this host's certificate request; it is the one that encrypts the private key in host.example.com.key.
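
If you manage several hosts, the entry can be generated rather than typed. This is only a sketch: secrets_line is a hypothetical helper, and it assumes the pass phrase contains no double quotes.

```shell
# Hypothetical helper: print an ipsec.secrets entry for an RSA private key.
# $1 = key file name under /etc/ipsec.d/private
# $2 = pass phrase protecting that key (must not contain double quotes)
secrets_line() {
    printf ': RSA %s "%s"\n' "$1" "$2"
}

# Example (as root); keep the file unreadable to other users:
#   secrets_line host.example.com.key 'password' >> /etc/ipsec.secrets
#   chmod 600 /etc/ipsec.secrets
```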

2) Configure ipsec.conf:
/etc/ipsec.conf should look something like the configuration below (note that the indentation is important; without it, openswan will fail):

version 2.0

config setup
    interfaces=%defaultroute
    nat_traversal=yes
    virtual_private=%v4:10.0.0.0/8,%v4:172.16.0.0/12,%v4:192.168.0.0/16

conn %default
    keyingtries=1
    compress=yes
    disablearrivalcheck=no
    authby=rsasig
    leftrsasigkey=%cert
    rightrsasigkey=%cert

conn roadwarrior-net
    leftsubnet=(your_subnet)/(your_netmask)
    also=roadwarrior

conn roadwarrior-all
    leftsubnet=0.0.0.0/0
    also=roadwarrior

conn roadwarrior
    left=%defaultroute
    leftcert=host.example.com.pem
    right=%any
    rightsubnet=vhost:%no,%priv
    auto=add
    pfs=yes

conn roadwarrior-l2tp
    type=transport
    left=%defaultroute
    leftcert=host.example.com.pem
    leftprotoport=17/1701
    right=%any
    rightprotoport=17/1701
    pfs=no
    auto=add

conn roadwarrior-l2tp-oldwin
    left=%defaultroute
    leftcert=host.example.com.pem
    leftprotoport=17/0
    right=%any
    rightprotoport=17/1701
    rightsubnet=vhost:%no,%priv
    pfs=no
    auto=add

conn block
    auto=ignore

conn private
    auto=ignore

conn private-or-clear
    auto=ignore

conn clear-or-private
    auto=ignore

conn clear
    auto=ignore

conn packetdefault
    auto=ignore

The 'roadwarrior-*' lines allow roadwarriors (i.e., regular IPSec clients) to connect to your IPSec gateway itself and to the network behind it, and to tunnel all traffic to the 'net at large through it. The roadwarrior-l2tp entries allow both older and newer versions of Windows to connect to an l2tpd daemon running on the same host as your Openswan gateway. Anyone with a valid certificate signed by your CA will be able to connect to your gateway. This configuration also includes NAT Traversal settings that allow a host behind a NAT gateway using RFC1918 private addresses (defined in the 'virtual_private' line) to connect. The 'auto=ignore' lines are there to disable Opportunistic Encryption, which can cause problems if not configured properly.
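
Because a missing indent in ipsec.conf makes Openswan fail, a quick check before restarting can save some head-scratching. The sketch below is not an Openswan tool: check_ipsec_indent is a hypothetical helper that simply flags key=value lines starting in the first column, on the assumption that only section headers, comments, and blank lines belong there.

```shell
# Hypothetical check: list key=value lines that start in column 1.
# Such lines belong inside a "config" or "conn" section and must be indented.
check_ipsec_indent() {
    # $1 = path to ipsec.conf; prints offending lines, fails if any were found
    awk '/^[^ \t#]/ && /=/ { printf "%s:%d: %s\n", FILENAME, NR, $0; bad = 1 }
         END { exit bad }' "$1"
}

# Example:
#   check_ipsec_indent /etc/ipsec.conf && echo "indentation looks OK"
```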

Configuring l2tpd on the Gateway Machine

1) Install l2tpd. On Debian (assuming you have 'unstable' in your sources.list), you can just 'apt-get install l2tpd'; on other distributions, you can find a binary distribution, or grab the source from http://www.l2tpd.org. If building from source, you probably want to build from the CVS version.

2) Configure l2tpd. On Debian, you'll need to edit the file '/etc/l2tpd/l2tpd.conf'. Here's an example:

[global]
auth file = /etc/l2tpd/l2tp-secrets
[lns default]
ip range = 192.168.100.240-192.168.100.250
local ip = 192.168.100.254
require chap = yes
refuse pap = yes
require authentication = yes
name = MyVPN
ppp debug = yes
pppoptfile = /etc/ppp/options.l2tpd.lns
length bit = yes

You'll need to change the 'ip range' to a block of unused addresses on your internal network that you would like to hand out to L2TP clients. The 'local ip' should be the local IP address of your box. The 'pppoptfile' specifies which PPP options file to use.

3) Configure your PPP options. From the example above, this is located at /etc/ppp/options.l2tpd.lns.

ipcp-accept-local
ipcp-accept-remote
ms-dns 192.168.100.1
ms-wins 192.168.100.1
auth
crtscts
idle 1800
mtu 1200
mru 1200
nodefaultroute
debug
lock
proxyarp
connect-delay 5000
nologfd

You'll need to change ms-dns and ms-wins to match your internal DNS and WINS servers. I've got the MTU set rather low so that packets won't be fragmented - if you leave the MTU at 1500, you may find that things like SMB shares don't work properly.
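
One way to see whether packets of a given size get through unfragmented is to ping with the "don't fragment" flag set. The sketch below only does the size arithmetic: ping_payload_for_mtu is a hypothetical helper, the 28 bytes assume an IPv4 header without options plus the ICMP header, and the commented ping call assumes the Linux iputils version of ping and uses the example DNS server address from above.

```shell
# Hypothetical helper: ICMP payload size that yields an IP packet of the given
# size (20 bytes of IPv4 header plus 8 bytes of ICMP header are subtracted).
ping_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

ping_payload_for_mtu 1200   # prints 1172

# On a Linux client connected through the VPN (iputils ping assumed):
#   ping -M do -s "$(ping_payload_for_mtu 1200)" 192.168.100.1
```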

4) Set up your authentication file. This is at /etc/ppp/chap-secrets.

# Secrets for authentication using CHAP
# client server secret IP addresses
username * password *

You can define multiple users with this method. If it's not obvious, 'username' is the username that will be used for authentication, and 'password' is the password. If you'd like to give a user a static IP, you can specify it in the fourth column, 'IP Addresses'.
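
As an illustration of the four columns, the sketch below prints one entry with and one without a static IP. chap_entry is a hypothetical helper, and the user names, passwords, and the address 192.168.100.245 are made-up examples.

```shell
# Hypothetical helper: print one chap-secrets entry (tab-separated).
# $1 = username, $2 = password
# $3 = optional static IP; defaults to "*" (any address)
chap_entry() {
    printf '%s\t*\t%s\t%s\n' "$1" "$2" "${3:-*}"
}

chap_entry alice secret1                  # any address
chap_entry bob secret2 192.168.100.245    # static IP

# Example (as root):
#   chap_entry alice secret1 >> /etc/ppp/chap-secrets
```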

That's it for the server side! Just start l2tpd with '/etc/init.d/l2tpd start', and you're set to go on to the clients.

Client Setup: Windows XP

This section covers configuring your Windows XP client to connect to the server with L2TP over IPsec.

First of all, please ensure that Windows XP SP2 or the NAT-Traversal patches are installed; this improves your ability to connect from behind a NAT gateway. Also, be sure you are logged in as a user with administrator privileges.

1) The first step is to import a certificate on your Windows box. For the sake of simplicity, I'll have you import the certificate using Xelerance's 'certimport.exe' tool.

- Download certimport from ftp://ftp.openswan.org/openswan/windows/certimport/, extract it, and install certimport.exe somewhere easy to get at.
- Generate a certificate (as described above) for the box, and save the .p12 format file. Copy this file over to your Windows box in a temporary folder somewhere.
- Import the certificate with:

certimport.exe -p password certificate.p12

2) Set up your L2TP over IPSec connection, as follows.

- Start->Settings->Network Connections
- Create a New Connection
- Connect to the network at my workplace
- Virtual Private Connection
- Company Name: Your VPN Name
- Dial Connection: Yes or no, depending on your needs
- Host Name or IP: Hostname or IP to connect to
- Finish the connection, and go to the properties for it.
- Load the Networking tab
- Change the 'Type' to 'L2TP IPSec VPN'
- Save your settings.
- Enter the username and password.

3) Connect! The VPN should come up nicely - if not, check the Linux side for errors.

Client Setup: Real IPSec Clients

I'm only covering L2TP over IPSec connections on this page, but if you would like to set up Openswan or Windows IPSec clients, please see my other page at http://www.natecarlson.com/linux/ipsec-x509.php. Note that the server configuration above is already set up to accept normal IPSec connections along with the L2TP connections.

Some common errors, and resolutions for them

I'll add common errors here as I come across them.

References
Openswan Documentation: http://www.openswan.org
Jacco de Leeuw's Page: http://www.jacco2.dds.nl/networking/freeswan-l2tp.html
