Word Formatting Flexifilter
I'm just preparing for my Ignite speech on Tuesday and thought I'd share a Flexifilter preset that should strip Word formatting from content users paste into a WYSIWYG editor. The filter is mostly based on the one posted by Michael Chris Neglia on his blog.
One thing to note: the HTML filter component of this preset should be adjusted to match the HTML tags allowed in your WYSIWYG profile. Also, if you're using WYSIWYG to assign classes, this preset will filter them out.
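To make those two caveats concrete, here's a rough PHP sketch of what the preset's five components do to pasted markup, in order. It's an approximation under a few assumptions, not Flexifilter's actual code: `word_filter_sketch()` is a made-up helper name, `strip_tags()` stands in for Drupal's HTML filter, and the `~` regex delimiters are my own choice since the preset only stores the bare patterns.

```php
<?php
// Hypothetical helper (not part of Flexifilter): applies the five steps in order.
function word_filter_sketch($text, $allowed_tags) {
  // 1. Drop the body of Word's conditional comments: "[if ...]" up to "[endif]".
  $text = preg_replace('~\[if[^\[]*?\[endif\]~', '', $text);
  // 2. Stand-in for the HTML filter: keep only the whitelisted tags.
  //    This is where your WYSIWYG profile's tag list goes.
  $text = strip_tags($text, $allowed_tags);
  // 3. Remove class attributes. The pattern is greedy up to the closing ">",
  //    so any attributes that follow class= in the same tag go with it.
  $text = preg_replace('~ class=[^>]*~', '', $text);
  // 4. Remove quoted inline style attributes.
  $text = preg_replace('~ style=(\'|")[^\'"]*(\'|")~', '', $text);
  // 5. Replace each pair of non-breaking-space entities with a single one.
  return str_replace('&nbsp;&nbsp;', '&nbsp;', $text);
}

// Made-up sample of Word-style markup.
$pasted = '<p class="MsoNormal" style="margin:0in">'
  . '<!--[if gte mso 9]><xml>junk</xml><![endif]-->'
  . 'Hello&nbsp;&nbsp;<span lang="EN-US">world</span></p>';

echo word_filter_sketch($pasted, '<p><a><em><strong><ul><ol><li>'), "\n";
```

With that sample the sketch should print `<p>Hello&nbsp;world</p>`: the `<span>` goes because it isn't in the allowed list, and the class goes regardless of whether your editor put it there or Word did. Note that step 5 is a single pass, so a run of three entities would still leave two behind.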
Here's the export:
a:9:{s:5:"label";s:11:"Word Filter";s:11:"description";s:71:"Removes formatting from content pasted from Word into a WYSIWYG editor.";s:2:"id";s:1:"2";s:7:"enabled";b:1;s:8:"advanced";b:1;s:5:"delta";s:1:"1";s:5:"cache";s:1:"0";s:10:"components";a:7:{i:0;a:3:{s:5:"class";s:22:"flexifilter_text_regex";s:8:"settings";a:3:{s:4:"find";s:20:"\[if[^\[]*?\[endif\]";s:7:"replace";s:0:"";s:4:"step";s:7:"process";}s:2:"id";s:2:"31";}i:1;a:3:{s:5:"class";s:39:"flexifilter_existing__filter__filter__0";s:8:"settings";a:4:{s:13:"filter_html_1";s:1:"1";s:14:"allowed_html_1";s:103:"<p> <div> <pre> <h1> <h2> <h3> <h4> <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd> <img>";s:18:"filter_html_help_1";i:0;s:22:"filter_html_nofollow_1";i:0;}s:2:"id";s:2:"32";}i:2;a:3:{s:5:"class";s:22:"flexifilter_text_regex";s:8:"settings";a:3:{s:4:"find";s:12:" class=[^>]*";s:7:"replace";s:0:"";s:4:"step";s:7:"process";}s:2:"id";s:2:"33";}i:3;a:3:{s:5:"class";s:22:"flexifilter_text_regex";s:8:"settings";a:3:{s:4:"find";s:23:" style=('|")[^'"]*('|")";s:7:"replace";s:0:"";s:4:"step";s:7:"process";}s:2:"id";s:2:"34";}i:4;a:3:{s:5:"class";s:24:"flexifilter_text_replace";s:8:"settings";a:3:{s:4:"find";s:12:"&nbsp;&nbsp;";s:7:"replace";s:6:"&nbsp;";s:4:"step";s:7:"process";}s:2:"id";s:2:"37";}s:7:"id_next";i:38;s:9:"id_prefix";s:22:"flexifilter_component_";}s:3:"fid";s:3:"new";}
Some people have had problems importing the above code, so here's a basic breakdown of the filter's components, in order, in case you need to recreate it by hand:
- RegEx Text Replace: Find "\[if[^\[]*?\[endif\]" and replace with an empty string.
- HTML filter: Only allow the HTML tags you can create using your WYSIWYG editor (don't forget table tags!)
- RegEx Text Replace: Find " class=[^>]*" and replace with an empty string. You can leave this out if you actually want your WYSIWYG editor to assign classes, but Word can sometimes include a lot of classes, which are annoying at worst.
- RegEx Text Replace: Find " style=('|")[^'"]*('|")" and replace with an empty string, once again. This removes all embedded CSS, which Word tends to bring in with it.
- Text Replace: Find "&nbsp;&nbsp;" and replace with "&nbsp;". Word loves to include multiple non-breaking spaces.
I'm also including the array definition, which should give you all the information you need to recreate the import.
array (
  'label' => 'Word Filter',
  'description' => 'Removes formatting from content pasted from Word into a WYSIWYG editor.',
  'id' => '2',
  'enabled' => true,
  'advanced' => true,
  'delta' => '1',
  'cache' => '0',
  'components' => array (
    0 => array (
      'class' => 'flexifilter_text_regex',
      'settings' => array (
        'find' => '\\[if[^\\[]*?\\[endif\\]',
        'replace' => '',
        'step' => 'process',
      ),
      'id' => '31',
    ),
    1 => array (
      'class' => 'flexifilter_existing__filter__filter__0',
      'settings' => array (
        'filter_html_1' => '1',
        'allowed_html_1' => '<p> <div> <pre> <h1> <h2> <h3> <h4> <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd> <img>',
        'filter_html_help_1' => 0,
        'filter_html_nofollow_1' => 0,
      ),
      'id' => '32',
    ),
    2 => array (
      'class' => 'flexifilter_text_regex',
      'settings' => array (
        'find' => ' class=[^>]*',
        'replace' => '',
        'step' => 'process',
      ),
      'id' => '33',
    ),
    3 => array (
      'class' => 'flexifilter_text_regex',
      'settings' => array (
        'find' => ' style=(\'|")[^\'"]*(\'|")',
        'replace' => '',
        'step' => 'process',
      ),
      'id' => '34',
    ),
    4 => array (
      'class' => 'flexifilter_text_replace',
      'settings' => array (
        'find' => '&nbsp;&nbsp;',
        'replace' => '&nbsp;',
        'step' => 'process',
      ),
      'id' => '37',
    ),
    'id_next' => 38,
    'id_prefix' => 'flexifilter_component_',
  ),
  'fid' => 'new',
);
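If the import keeps failing, one possible culprit is the copy itself. The export has the shape of PHP `serialize()` output, where every string carries a byte-length prefix; the Text Replace component declares its find value as 12 bytes and its replacement as 6, which matches `&nbsp;&nbsp;` and `&nbsp;`. The sketch below assumes the import runs the string through `unserialize()` (a guess, not something confirmed here) and shows how a copy whose entities were turned into plain spaces stops parsing. The `$component`, `$export` and `$damaged` names are just for illustration.

```php
<?php
// The fifth component, written out with the entities intact.
$component = array(
  'class' => 'flexifilter_text_replace',
  'settings' => array(
    'find' => '&nbsp;&nbsp;',
    'replace' => '&nbsp;',
    'step' => 'process',
  ),
  'id' => '37',
);

// serialize() records each string's byte length: s:12 and s:6 for the entities.
$export = serialize($component);
echo $export, "\n";

// An intact copy round-trips back to the same array.
$restored = unserialize($export);
var_dump($restored === $component);

// Simulate a copy where every entity was rendered as a plain space.
// The s:12 and s:6 prefixes no longer match, so unserialize() gives up.
$damaged = str_replace('&nbsp;', ' ', $export);
var_dump(@unserialize($damaged));
```

The first `var_dump()` should report `bool(true)` and the second `bool(false)`. If your pasted copy behaves like the damaged one, rebuilding the preset by hand from the array definition is probably the easier route.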