<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/rss2full.xsl" type="text/xsl" media="screen"?><?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/itemcontent.css" type="text/css" media="screen"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Aaron Rosenfeld</title>
	
	<link>http://aaron-rosenfeld.com</link>
	<description>Web Development, programming, mathematics, and other ramblings.</description>
	<pubDate>Wed, 19 Nov 2008 06:21:08 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6</generator>
	<language>en</language>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/AaronRosenfeld" type="application/rss+xml" /><item>
		<title>What Every Computer Scientist Should Know About Floating-Point Arithmetic</title>
		<link>http://feeds.feedburner.com/~r/AaronRosenfeld/~3/458046192/</link>
		<comments>http://aaron-rosenfeld.com/2008/11/19/floating-point-article/#comments</comments>
		<pubDate>Wed, 19 Nov 2008 06:21:08 +0000</pubDate>
		<dc:creator>Aaron Rosenfeld</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://aaron-rosenfeld.com/?p=110</guid>
		<description><![CDATA[Floating point numbers always seem to confuse programmers, including myself, who haven&#8217;t fully looked into the IEEE standard that dictates how they are stored and manipulated in most modern languages.  My System Architecture professor recommended this article to our class to help us understand a portion of the MIPS architecture.
It is somewhat lengthy at 72 [...]]]></description>
			<content:encoded><![CDATA[<p>Floating point numbers always seem to confuse programmers, including myself, who haven&#8217;t fully looked into the IEEE standard that dictates how they are stored and manipulated in most modern languages.  My System Architecture professor recommended <a href="http://docs.sun.com/source/806-3568/ncg_goldberg.html" target="_blank">this article</a> to our class to help us understand a portion of the MIPS architecture.</p>
<p>It is somewhat lengthy at 72 pages, but I can honestly say it is one of the most informative papers I have ever read.  Literally <em>everything</em> you have ever wanted to know about floating point standards is covered.  It covers everything from how the numbers are stored at the binary level all the way through how mathematical operations are performed on them.</p>
<p>Really, it&#8217;s worth reading.  Even if you don&#8217;t read all of the mathematical proofs at the end, at least glance at the IEEE standard way of storing floating-point numbers and the section on accuracy.  I guarantee it will make the whole topic far easier to grasp.</p>
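<p>As a small taste of the accuracy issues the article covers, here is a minimal sketch in PHP.  The variable names and the tolerance value are my own choices for illustration and do not come from the article.</p>

```php
<?php
// 0.1 and 0.2 have no exact binary representation, so their sum is
// not bit-for-bit equal to the double nearest to 0.3.
$sum = 0.1 + 0.2;
$exactlyEqual = ($sum == 0.3);

// Comparing against a small tolerance is the usual workaround.
// The tolerance below is an arbitrary value picked for this example.
$tolerance = 1e-9;
$closeEnough = (abs($sum - 0.3) < $tolerance);

var_dump($exactlyEqual, $closeEnough);
```

<p>The exact comparison comes out false while the tolerance-based one comes out true, which is the kind of surprise the section on accuracy explains.</p>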
<p>Link to article: <a href="http://docs.sun.com/source/806-3568/ncg_goldberg.html" target="_blank">http://docs.sun.com/source/806-3568/ncg_goldberg.html</a></p>
]]></content:encoded>
			<wfw:commentRss>http://aaron-rosenfeld.com/2008/11/19/floating-point-article/feed/</wfw:commentRss>
		<feedburner:origLink>http://aaron-rosenfeld.com/2008/11/19/floating-point-article/</feedburner:origLink></item>
		<item>
		<title>PHP/SMS Article Published In php|architect</title>
		<link>http://feeds.feedburner.com/~r/AaronRosenfeld/~3/406470434/</link>
		<comments>http://aaron-rosenfeld.com/2008/09/29/php-sms/#comments</comments>
		<pubDate>Mon, 29 Sep 2008 18:06:35 +0000</pubDate>
		<dc:creator>Aaron Rosenfeld</dc:creator>
		
		<category><![CDATA[Computer Science]]></category>

		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://aaron-rosenfeld.com/?p=43</guid>
		<description><![CDATA[Over the past few months I have been developing a free method of sending and receiving SMS messages via PHP.  As I continued to add features and turn the method into a library, I found more and more uses to connect my site to my phone.  I talked with a number of individuals [...]]]></description>
			<content:encoded><![CDATA[<p>Over the past few months I have been developing a free method of sending and receiving SMS messages via PHP.  As I continued to add features and turn the method into a library, I found more and more uses to connect my site to my phone.  I talked with a number of individuals about the process and they all seemed intrigued by the idea of not having to pay for 3rd party SMS-forwarders.</p>
<p>So, I decided to write an article about it with the hopes that it would benefit others.  php|architect was kind enough to support my writing and published the article in the <a href="http://www.phparch.com/c/magazine/issue/82" target="_blank">September issue</a>.  I&#8217;d like to thank Steph Fox for her ongoing help with both the technical and language aspects of the post.  This article would not have been possible without her excellent guidance.</p>
<p>For those of you looking for the classes referenced and explained in the article, they can both be downloaded below.  Note that the <strong>NotificationHandler</strong> class also includes logging functionality which was not covered in the article.</p>
<p><code><a href="http://aaron-rosenfeld.com/wp-content/plugins/download-monitor/download.php?id=1" title="Version 1.0.0 downloaded 89 times" >SMS Article Classes (1.51 KB)</a></code></p>
]]></content:encoded>
			<wfw:commentRss>http://aaron-rosenfeld.com/2008/09/29/php-sms/feed/</wfw:commentRss>
		<feedburner:origLink>http://aaron-rosenfeld.com/2008/09/29/php-sms/</feedburner:origLink></item>
		<item>
		<title>VPS Hosting Review - ServInt</title>
		<link>http://feeds.feedburner.com/~r/AaronRosenfeld/~3/389146954/</link>
		<comments>http://aaron-rosenfeld.com/2008/09/10/vps-hosting-servint/#comments</comments>
		<pubDate>Thu, 11 Sep 2008 00:25:52 +0000</pubDate>
		<dc:creator>Aaron Rosenfeld</dc:creator>
		
		<category><![CDATA[Servers]]></category>

		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://aaron-rosenfeld.com/?p=74</guid>
		<description><![CDATA[You know, over the past 8 years I have been through at least 10 different web hosts.  This is partially my fault - I&#8217;m an extremely picky customer.  I demand that a company has excellent customer support and maintains all of their guarantees, namely uptime.
I can say after all of these years, I have finally [...]]]></description>
			<content:encoded><![CDATA[<p>You know, over the past 8 years I have been through at least 10 different web hosts.  This is partially my fault - I&#8217;m an extremely picky customer.  I demand that a company has excellent customer support and maintains all of their guarantees, namely uptime.</p>
<p>I can say after all of these years, I have finally found the best company I&#8217;ve ever had the pleasure of dealing with.  <a href="http://www.servint.net" target="_blank">ServInt</a> is a small (in number of employees) VPS-only web host who maintains the best business model anyone could ever ask for.  They have been around since 1995 and have a spotless track record.  Their servers are incredibly reliable and I have not had a single outage since I started using them.  Their prices are extremely reasonable starting at only $50/mo which is still an excellent package including 1 GB burstable memory, half a terabyte of transfer, 15 GB of storage, and 4 dedicated IPs.  Both storage and transfer quantities are easily upgradable and are both extremely reasonable in price.  All packages get root access and free backups in case of an outage or user error on the admin&#8217;s part.  Cpanel/WHM or Plesk is also included at no charge.</p>
<p>That being said, there are dozens of companies that can provide more for the same price.  What ended up selling me was their customer support.  I e-mailed their billing department before signing up for hosting and was amazed when a real human, with a name and personal phone-number responded to me.  She was extremely well spoken and went above and beyond what was required of her.</p>
<p>The next day when I signed up, I promptly received a turnup e-mail and went along my way setting up various sites on the new VPS.  I almost had a heart attack when a customer service rep called me and asked if he could assist with migration or any software installation.  I was floored.  A proactive phone call from a hosting company?  Amazing.</p>
<p>Customers are given the standard panel for billing, support tickets, and other miscellaneous services but are also given access to a forum.  Not a dead, useless forum - a lively, active forum.  Other customers regularly discuss their experiences and <strong>collaborate</strong> with ServInt (not just talk one-way) to solve problems and talk about hosting technology.  Not only do customer support and their NOC staff post on these forums, but the CEO himself frequents the forums and regularly contributes to discussions.  Now that&#8217;s different!</p>
<p>This wasn&#8217;t endorsed in any way by ServInt but they are running a truly excellent business and they deserve to be recognized.  They are by far the most personal hosting company I have dealt with and I recommend them to anyone looking for a VPS hosting solution.</p>
<p><a href="http://www.servint.net/index.php?refid=CAB442911874"><img src="http://img.servint.net/120x60a.gif" border="0" alt="" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://aaron-rosenfeld.com/2008/09/10/vps-hosting-servint/feed/</wfw:commentRss>
		<feedburner:origLink>http://aaron-rosenfeld.com/2008/09/10/vps-hosting-servint/</feedburner:origLink></item>
		<item>
		<title>PHP v5.3 Alpha 1</title>
		<link>http://feeds.feedburner.com/~r/AaronRosenfeld/~3/377679965/</link>
		<comments>http://aaron-rosenfeld.com/2008/08/28/php-v53-alpha1-released/#comments</comments>
		<pubDate>Fri, 29 Aug 2008 02:01:33 +0000</pubDate>
		<dc:creator>Aaron Rosenfeld</dc:creator>
		
		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://aaron-rosenfeld.com/?p=65</guid>
		<description><![CDATA[I usually don't update my blog for PHP versions but the announcement made at the beginning of the month really caught my eye.  Changes to the Tick construct, Ternary operator, and the addition of the goto keyword are all included.]]></description>
			<content:encoded><![CDATA[<p>I usually don&#8217;t update my blog for PHP versions but the <a href="http://www.php.net/archive/2008.php#id2008-08-01-1" target="_blank">announcement made at the beginning of the month</a> really caught my eye.</p>
<p><strong>declare(Ticks=N) Deprecated</strong><br />
First off, they finally deprecated the declare(Ticks&#8230;) construct.  I never really found a use for this and actually had it crash a web server due to threading issues.  Thankfully, PHP is no longer formally supporting this construct along with register_tick_function.</p>
<p><strong>Ternary Default</strong><br />
In the last few months I have found myself using the ternary operator extensively to save space.  Most of the time it ends up being in the form:</p>
<pre>$var .= $other_var ? 'some string' : '';</pre>
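<p>Thankfully, 5.3 adds a shorthand that omits the middle operand:</p>
<pre>$var .= $other_var ?: 'some other string';</pre>
<p>This returns the tested value itself when it is truthy and the right-hand string otherwise.  Only the middle operand can be left out, so the form above, with its empty false case, still has to be written in full.</p>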
<p><strong>Namespace</strong><br />
For some reason namespace isn&#8217;t currently a reserved word.  In v5.3 Alpha 1, it is.</p>
<p><strong>goto Keyword</strong><br />
This one I don&#8217;t like.  PHP decided to add the dreaded goto/label: construct.  I have absolutely no clue why they would do this since it is never needed in modern programming.</p>
<p>Nothing is groundbreaking but most of the changes are a breath of fresh air.  My only concern is <code>goto</code> which I hope does not make it to the release branch&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://aaron-rosenfeld.com/2008/08/28/php-v53-alpha1-released/feed/</wfw:commentRss>
		<feedburner:origLink>http://aaron-rosenfeld.com/2008/08/28/php-v53-alpha1-released/</feedburner:origLink></item>
		<item>
		<title>phpWatch Release Announcement</title>
		<link>http://feeds.feedburner.com/~r/AaronRosenfeld/~3/377247321/</link>
		<comments>http://aaron-rosenfeld.com/2008/08/28/phpwatch-release-announcement/#comments</comments>
		<pubDate>Thu, 28 Aug 2008 16:06:55 +0000</pubDate>
		<dc:creator>Aaron Rosenfeld</dc:creator>
		
		<category><![CDATA[Databases]]></category>

		<category><![CDATA[Servers]]></category>

		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://aaron-rosenfeld.com/?p=51</guid>
		<description><![CDATA[I am pleased to announce that I have released phpWatch v1.0.6 Beta and it is now available for download.
phpWatch is a general purpose service monitor that is able to send notifications of outages via e-mail or text-message (SMS).  The purpose of this system is two-fold: it allows administrators to easily check the status of [...]]]></description>
			<content:encoded><![CDATA[<p>I am pleased to announce that I have released phpWatch v1.0.6 Beta and it is now available for download.</p>
<p>phpWatch is a general purpose service monitor that is able to send notifications of outages via e-mail or text-message (SMS).  The purpose of this system is two-fold: it allows administrators to easily check the status of many different services running on any number of servers and also allows developers to interface with the query and notification APIs.</p>
<p>A demonstration of the administrator view is available <a href="http://aaron-rosenfeld.com/phpWatch/demo" target="_blank">here</a>.  The AJAX based user-interface makes it simple to view, add, edit, and delete service monitors as well as notifications on a single page.  The configuration page allows users to customize and format the notifications that are sent out during a service outage.  Note that when a new version of phpWatch is released, the bottom of the page (where it shows the phpWatch version number) will be updated with a notification.</p>
<p>The developer documentation is available <a href="http://aaron-rosenfeld.com/phpWatch/demo/docs" target="_blank">here</a>.  The API allows other PHP scripts to query the monitored services in real-time, gather statistics about services, and interact with the notification system to send SMS and e-mail alerts.</p>
<p>If you find any bugs or have a suggestion, please feel free to add them to the bug-tracker on Sourceforge or send an e-mail to me at <em>aaron (at) aaron-rosenfeld (dot) com</em>.</p>
<p><strong>Demonstration Page</strong><br />
<a href="http://aaron-rosenfeld.com/phpWatch/demo"> http://aaron-rosenfeld.com/phpWatch/demo</a></p>
<p><strong>Developer Documentation</strong><br />
<a href="http://aaron-rosenfeld.com/phpWatch/demo/docs"> http://aaron-rosenfeld.com/phpWatch/demo/docs</a></p>
<p><strong>Download Link</strong><br />
<a href="http://aaron-rosenfeld.com/wp-content/plugins/download-monitor/download.php?id=2" title="Version 1.0.6 Beta downloaded 62 times" >phpWatch v1.0.6 Beta</a> - 56kb</p>
<p><a href="http://sourceforge.net/project/showfiles.php?group_id=233530&amp;package_id=283373"></a></p>
]]></content:encoded>
			<wfw:commentRss>http://aaron-rosenfeld.com/2008/08/28/phpwatch-release-announcement/feed/</wfw:commentRss>
		<feedburner:origLink>http://aaron-rosenfeld.com/2008/08/28/phpwatch-release-announcement/</feedburner:origLink></item>
		<item>
		<title>Computer Science vs. IT</title>
		<link>http://feeds.feedburner.com/~r/AaronRosenfeld/~3/339137595/</link>
		<comments>http://aaron-rosenfeld.com/2008/07/18/computer-science-vs-it/#comments</comments>
		<pubDate>Fri, 18 Jul 2008 16:26:54 +0000</pubDate>
		<dc:creator>Aaron Rosenfeld</dc:creator>
		
		<category><![CDATA[Computer Science]]></category>

		<guid isPermaLink="false">http://aaron-rosenfeld.com/?p=19</guid>
		<description><![CDATA[<p>A discussion of the differences between Computer Science and Information Technology (IT).  I tried to be as neutral as possible and covered the strengths and weaknesses of both fields.</p>]]></description>
			<content:encoded><![CDATA[<p>I am always excited when someone asks me what I&#8217;m majoring in at college - I promptly reply, &#8220;Computer Science&#8221; and brace for the reaction.  Nine times out of ten I get the dreaded response along the lines of, &#8220;Oh that&#8217;s a great field!  I know there are a lot of IT jobs now-a-days&#8221;.</p>
<p>What I have realized is that the vast majority of individuals who have non-technology jobs consider anyone who &#8220;works with computers&#8221; to be IT, and I want to make it clear that:</p>
<p style="text-align: center;"><strong>COMPUTER SCIENCE IS NOT INFORMATION TECHNOLOGY</strong></p>
<p><em>Note</em>: Before I even start, I want to make it clear that I did not intend this article to favor IT or Computer Science but I am obviously biased towards my major of Computer Science.  I apologize if any of the information is not accurate for a specific person - we all know what you major in may not be what you end up doing&#8230;</p>
<p>What differentiates the two is Computer Science is just that - a science.  We generally take math through at least differential equations, four to six terms of a lab science, and specialize in fields like artificial intelligence, data structures, bioinformatics, and human-computer interaction.</p>
<p>In my eyes, IT is an applied skill set for computers and networks rather than a science.  Just as a carpenter works with wood or a plumber with pipes, an IT professional works with computers and networks. To quote <a href="http://en.wikipedia.org/wiki/Information_technology" target="_blank">Wikipedia</a>:</p>
<blockquote><p>IT professionals perform a variety of duties that range from installing applications to designing complex computer networks and information databases.</p></blockquote>
<p>IT majors generally concentrate on aspects of computing such as server technology, database design, database administration, networking, and basic software design.  I am by no means trying to make IT sound like a narrow-minded field - in fact, IT students may come out much more prepared for the &#8220;real world&#8221; than many Computer Science students.</p>
<p>Almost every company that utilizes computers, in any way, has IT.  They deal with everything from day-to-day operations of the computer systems and databases to managing major overhauls of company-wide networks.</p>
<p>Computer Science is entirely different.  Again to quote <a href="http://en.wikipedia.org/wiki/Computer_science" target="_blank">Wikipedia</a>:</p>
<blockquote><p>Computer Science [...] is the study and the science of the theoretical foundations of information and computation and their implementation and application in computer systems.</p></blockquote>
<p>Computer Scientists in a (very small) nutshell design algorithms and <em>create</em> software.  That is the key difference, in my opinion: Computer Science deals, in general, with software development and the heavily theoretical side of computing rather than (for lack of better words) the nuts-and-bolts of the computing world.</p>
<p>The operating system you are reading this from was most likely developed by computer scientists.  The browser you are reading this from was most likely developed by computer scientists.  The protocol that takes the 1&#8217;s and 0&#8217;s from half way across the world and converts them into something you can read was most likely developed by computer scientists.  The software that NASA uses to get the Space Shuttle into orbit and back was most likely developed by computer scientists.  You get the point&#8230;</p>
<p>Generally speaking, Computer Scientists are employed by companies that need custom software developed or large applications that need constant additions or changes.  They are the heart of companies like Microsoft, Sun Microsystems, Google, and all the other big-name software companies out there.  IT on the other hand is a much more &#8220;professional oriented&#8221; field.  They are employed by a much broader range of companies, from the software giants to banks to newspapers to&#8230;anything with a computer network.</p>
<p>Computer Science, however, is a much broader field than the general public understands and I could not even begin to go into every facet in this article.  What I want to make clear, however, is there is a distinct difference between IT and Computer Science.  Both are extremely valuable but their paths only cross on certain topics.</p>
]]></content:encoded>
			<wfw:commentRss>http://aaron-rosenfeld.com/2008/07/18/computer-science-vs-it/feed/</wfw:commentRss>
		<feedburner:origLink>http://aaron-rosenfeld.com/2008/07/18/computer-science-vs-it/</feedburner:origLink></item>
		<item>
		<title>10 PHP Tips I Wish I Had Known</title>
		<link>http://feeds.feedburner.com/~r/AaronRosenfeld/~3/329470439/</link>
		<comments>http://aaron-rosenfeld.com/2008/07/08/10-php-tips-i-wish-i-had-known/#comments</comments>
		<pubDate>Tue, 08 Jul 2008 04:02:44 +0000</pubDate>
		<dc:creator>Aaron Rosenfeld</dc:creator>
		
		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://aaron-rosenfeld.com/?p=17</guid>
		<description><![CDATA[<p>I have been using PHP for about six years now and have never stopped learning.  These are a few things I wish I had known when I began programming in PHP.</p>]]></description>
			<content:encoded><![CDATA[<p>I have been using PHP for about six years now and have never stopped learning.  These are a few things I wish I had known when I began programming in PHP:</p>
<ol>
<li><strong>Quotes:</strong> Understand the difference between double-quotes and single-quotes.  I personally do not like the usage of variables in double quotes:
<pre>$var = "my var's value";</pre>
<p>I much prefer</p>
<pre>$var = 'my var\'s value';</pre>
<p>That is a personal preference but it really is extremely important to know when to use different types of quotes and understand escape sequences.</li>
<li><strong>Don&#8217;t get == confused with ===:</strong> The triple equals checks if the operands are identical &#8212; that is they have the same type with all of the same properties.  Example with string:
<pre>if('1' == 1){...} // evaluates to true
if('1' === 1){...} // evaluates to false</pre>
</li>
<li><strong>Use print_r liberally:</strong> If there is ever confusion when it comes to objects or arrays, print_r will come in extremely handy.  Also remember that print_r($var) prints its output immediately, at the point where it is called (often before the rest of your page), whereas print_r($var, true) returns it as a string.</li>
<li><strong>Make a database wrapper class:</strong> When you eventually delve into the MySQL (or any other database) world, there will be a ton of new functions to learn: mysql_connect, mysql_select_db, mysql_query, mysql_num_rows, mysql_fetch_array to name just a few.  Make it easy by making a single database class to wrap all of this together.  It will save you a ton of time and headache in the end.</li>
<li><strong>Avoid the &lt;? shorthand:</strong> Always use the full &lt;?php.  This can cause problems with PHP installations that do not have the shorthand enabled and also confuse some XML parsers.</li>
<li><strong>Never rely on <a href="http://www.php.net/manual/en/ini.core.php#ini.register-globals" target="_new">register_globals</a>:</strong> Always use $_GET, $_POST, and other pre-defined variables.  This guarantees compatibility with other PHP installations.</li>
<li><strong>Choosing a PHP Version:</strong> Use a PHP version you are comfortable with: PHP4 classes are incredibly simple to use.  There are no visibility keywords to worry about and methods can be called either on an instance or statically.  PHP5 introduces the vast majority of the OOP paradigm, including visibility, the static keyword, class constants, method overloading, and class abstraction.  If you have never been exposed to this, it can be quite difficult to take in while also learning PHP.  That being said, it never hurts to use PHP5+ as it is largely backwards compatible.</li>
<li><strong>Modularize:</strong> After working with PHP for a couple of days or so, consider modularizing your commonly used functions, classes, etc.  This can be as simple as breaking them into files.  I found that I reuse the same code (albeit much cleaned up) that I did 3 years ago.</li>
<li><strong>Get into OOP as soon as possible:</strong> I remember the first year or two I was using PHP, I made maybe six websites for various organizations I was involved with.  The code was so horribly sloppy it still makes me sick to this day.  Do not get in the habit of mixing your PHP and HTML!  I cannot emphasize this enough!  Get in the habit of keeping the vast majority of code in included files and simply calling functions from the HTML itself.</li>
<li><strong>Utilize the PHP documentation fully:</strong> PHP, in my opinion, has bar-none the best documentation of any scripting language.  <a href="http://php.net/" target="_blank">php.net</a> contains a page dedicated to every function in PHP along with examples and alternatives for each.  At the bottom there is usually a wealth of user-submitted comments which provide even more information.</li>
</ol>
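<p>A few of the tips above (quotes, == versus ===, and the second argument to print_r) can be checked with a short script.  This is only a sketch; the variable names are made up for the example.</p>

```php
<?php
// Quotes: double quotes expand variables, single quotes do not.
$name = 'world';
$double = "hello $name";   // variable is expanded
$single = 'hello $name';   // taken literally

// == compares after type juggling, === also requires the same type.
$loose  = ('1' == 1);
$strict = ('1' === 1);

// print_r($var, true) returns the dump as a string instead of printing it.
$dump = print_r(array('a' => 1), true);

var_dump($double, $single, $loose, $strict);
echo $dump;
```

<p>Capturing the print_r output in a string like this makes it easy to write it to a log file instead of the page.</p>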
]]></content:encoded>
			<wfw:commentRss>http://aaron-rosenfeld.com/2008/07/08/10-php-tips-i-wish-i-had-known/feed/</wfw:commentRss>
		<feedburner:origLink>http://aaron-rosenfeld.com/2008/07/08/10-php-tips-i-wish-i-had-known/</feedburner:origLink></item>
		<item>
		<title>Protein Folding Simulation</title>
		<link>http://feeds.feedburner.com/~r/AaronRosenfeld/~3/327912070/</link>
		<comments>http://aaron-rosenfeld.com/2008/07/06/protein-folding-simulation/#comments</comments>
		<pubDate>Sun, 06 Jul 2008 07:42:41 +0000</pubDate>
		<dc:creator>Aaron Rosenfeld</dc:creator>
		
		<category><![CDATA[Computer Science]]></category>

		<category><![CDATA[Mathematics]]></category>

		<guid isPermaLink="false">http://aaron-rosenfeld.com/?p=14</guid>
		<description><![CDATA[<p>Given a sequence of amino-acids (sometimes given as DNA bases which are simply converted to codons and then amino-acids), what is the structure of a protein after a given time? This article investigates some methods used to figure out this 50 year-old problem.</p>]]></description>
			<content:encoded><![CDATA[<blockquote><p><strong>I wrote this article about a year ago for my old site and decided it was worthy of a re-post.</strong></p></blockquote>
<h2>What is the problem?</h2>
<p>Given a sequence of amino-acids (sometimes given as DNA bases which are simply  converted to codons and then amino-acids), what is the structure of a protein  after a time <em>t</em>? In addition to the general protein-protein forces, the  solution in which the protein is folding must be considered. Temperature, pH,  and electromagnetic properties all affect the final outcome. In general, however,  these other forces are held constant so only protein-protein forces need to be  calculated.</p>
<h2>General Approach</h2>
<p>The most intuitive approach is to re-calculate the protein&#8217;s shape at every multiple of some interval of time (called Δ<em>h</em> in this case) until the <em>end condition</em> is reached. That is, start at <em>t</em>=0 and keep incrementing time by Δ<em>h</em> until we arrive at <em>t</em>=<em>t</em><sub>f</sub> (when the protein is completely folded).</p>
<p>The next obvious question is, &#8220;When is the protein done folding?&#8221; It turns out that once the protein is in its lowest energy configuration, it stops adjusting its shape and comes to rest. Although I will omit the full derivation of why this is the case, I will give an analogy that illustrates it.</p>
<p>If a ball is released from rest at a height above the ground (really any height above the centerpoint of the earth but for simplicity I will say above the earth) it naturally falls towards the ground. This is because its (potential) energy is relatively high. As it accelerates towards the ground, it loses this potential energy (gaining kinetic energy). Once it hits the ground (ignoring any rebound), both the kinetic and potential energy equal 0. This is the low energy (rest) state of the ball. It will not move on its own. The same is true for proteins. Once a protein is in its lowest energy state it will not &#8220;want&#8221; to change its shape.</p>
<h2>Calculation Methods</h2>
<p><strong>Method One</strong><br />
The first method of simulating the protein&#8217;s folding is to simply determine the force on each atom and update its position from that.  The math behind this is relatively simple to understand. All that needs to be done is calculate the force of each atom on every other atom. Using Newton&#8217;s  basic laws of motion, the force acting on the <em>i</em><sup>th</sup> amino-acid is given by:</p>
<p style="text-align: center;"><img src="http://aaron-rosenfeld.com/files/protein/eq1.gif" alt="Equation One" /></p>
<p><strong>Method Two</strong><br />
Although the previous method works, a (somewhat) easier way to do these calculations ignores the forces of each atom and instead uses the potential energy of the entire system. Letting φ<sub>i</sub> be the potential energy of the system at the <em>i</em><sup>th</sup> atom (as a function of all <em>n</em> atoms that contribute to the potential energy), the force on the <em>i</em><sup>th</sup> atom is given by:</p>
<p style="text-align: center;"><img src="http://aaron-rosenfeld.com/files/protein/eq2.gif" alt="Equation Two" /></p>
<p>As a side note, in case you are not familiar with what a gradient is, a full description of it can be found on <a href="http://en.wikipedia.org/wiki/Gradient">Wikipedia</a> but as a brief overview, a gradient is a vector field where each vector points in the direction of the largest increase of whatever is being measured (in this case φ<sub>i</sub>). A gradient can be expanded into the partial derivative in each direction (in this case x, y, and z). Since this needs to be done numerically in the end, the right side of the following uses the <em>value</em> of each derivative at the <em>i</em><sup>th</sup> atom:</p>
<p style="text-align: center;"><img src="http://aaron-rosenfeld.com/files/protein/eq3.gif" alt="Equation Three" /></p>
<p>This again goes back to the energy state discussion. Because the gradient of  φ yields the direction of increasing energy, going in the negative direction  of the gradient leads towards the lowest energy and an eventual end condition  for the protein&#8217;s folding.</p>
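<p>To make the &#8220;negative gradient&#8221; idea concrete, here is a minimal PHP sketch that estimates the force on one atom with central differences.  The harmonic potential, the function names, and the step size are all assumptions chosen for illustration; a real simulation would plug in the full potential energy of the system.</p>

```php
<?php
// Hypothetical toy potential: a harmonic well centered at the origin.
// It stands in for the real potential energy of the system.
function potential($p)
{
    return 0.5 * ($p[0] * $p[0] + $p[1] * $p[1] + $p[2] * $p[2]);
}

// Central-difference estimate of F = -grad(phi) at position $p.
// $phi is the name of a function taking an array(x, y, z).
function force($phi, $p, $eps = 1e-5)
{
    $f = array();
    for ($k = 0; $k < 3; $k++) {
        $plus = $p;
        $plus[$k] += $eps;
        $minus = $p;
        $minus[$k] -= $eps;
        // Negative slope of the potential along axis $k.
        $f[$k] = -($phi($plus) - $phi($minus)) / (2 * $eps);
    }
    return $f;
}

$f = force('potential', array(1.0, -2.0, 0.5));
print_r($f);
```

<p>For this toy potential the force simply points back towards the origin, which is the lowest energy position.</p>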
<h2>Energy of molecules</h2>
<p>In the previous section φ was used to represent the energy of a system. In this section the actual value of the energy of molecule-molecule interaction will be investigated. From over one hundred years of research, a number of different forces between atoms have been determined. Hydrogen bonds, electrostatic interaction, and van der Waals attraction are the three most accepted. In addition to these forces, the angle between atoms must be considered along with how much the bonds can stretch, and how much the bonds can rotate along their axis (sometimes called their Dihedral Angle Permittivity). The actual mathematics behind these forces is studied in depth by chemists. Luckily, the potential energy each one causes can be reduced to nothing more than algebraic expressions. Note also that all include some sort of coefficient (e.g. angle-bending coefficient, Dihedral Angle Permittivity coefficient, etc.), which will be denoted with a <em>C</em> followed by a subscript. To start, the total potential energy on the atoms in question is given by:</p>
<p style="text-align: center;"><img src="http://aaron-rosenfeld.com/files/protein/eq_u.gif" alt="Equation Energy" /></p>
<p>Expanding each term gives:</p>
<p style="text-align: center;"><img src="http://aaron-rosenfeld.com/files/protein/eq_full.gif" alt="Equation Energy Expanded" /></p>
<h2>Numerical Method</h2>
<p>The math above looks great on paper but when it comes to implementation, computers don&#8217;t natively have &#8220;gradient&#8221;, &#8220;derivative&#8221;, or &#8220;integral&#8221; functions. To solve this, Maclaurin (or Taylor) series expansion is generally used. Again I will not go into detail about what a Taylor or Maclaurin series is (<a href="http://en.wikipedia.org/wiki/Taylor_series">Wikipedia</a> has more information), but as an overview, both series represent a function as an infinite sum of terms built from its derivatives at a single point; a Maclaurin series is simply a Taylor series taken about zero. The official notation for a Maclaurin series is:</p>
<p style="text-align: center;"><img src="http://aaron-rosenfeld.com/files/protein/eq4.gif" alt="Equation Four" /></p>
<p>Note that for all of the functions in this article, this series actually <em>equals</em> the function itself when all of its infinitely many terms are included, not approximately equals. With this understanding, it is possible to use an integration algorithm such as the <a href="http://en.wikipedia.org/wiki/Verlet_integration">Verlet algorithm</a> to propagate an atom&#8217;s position vector from time <em>t</em> - Δ<em>h</em>, through time <em>t</em>, to time <em>t</em> + Δ<em>h</em> without needing to know the velocity of the atom. All that is needed is the current and previous positions, the time step, and the acceleration, which can be determined from the force. Again, the math of such an algorithm will be omitted as it is beyond the scope of this article.</p>
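<p>Although the derivation is skipped here, the update itself is short. Below is a minimal sketch of the basic (position) Verlet step in PHP, which computes the next position from the positions at the current and previous time steps plus the current acceleration. The function name verletStep and the constant-acceleration test case are assumptions made for the example, not part of the simulation software.</p>
<pre>&lt;?php
// One basic (position) Verlet step for a single position vector.
// $current and $previous are the positions at t and t - h; $accel is a(t) = F/m.
function verletStep(array $current, array $previous, array $accel, float $h): array
{
	$next = [];
	foreach ($current as $i =&gt; $x) {
		$next[$i] = 2 * $x - $previous[$i] + $accel[$i] * $h * $h;
	}
	return $next;
}

// Toy check: constant acceleration of -9.8 along z, starting at rest at z = 10.
$h = 0.01;
$accel = [0.0, 0.0, -9.8];
$previous = [0.0, 0.0, 10.0];
// Position one step later for a body starting at rest: z0 + 0.5 * a * h^2.
$current = [0.0, 0.0, 10.0 + 0.5 * -9.8 * $h * $h];

for ($step = 1; $step &lt; 100; $step++) {
	$next = verletStep($current, $previous, $accel, $h);
	$previous = $current;
	$current = $next;
}
// After 100 steps (t = 1) the exact answer is 10 - 4.9 = 5.1.
printf("z = %.4f\n", $current[2]);</pre>
<p>Notice that no velocity appears anywhere in the update; it is implied by the difference between the two stored positions.</p>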
<h2>Computational Complexity</h2>
<p>Possibly the largest hurdle in protein-folding simulation is the sheer amount  of time the calculations discussed take to complete. All of the summations above  take a great deal of time. The two most intensive display O(<em>n</em><sup>2</sup>)  performance. This alone, when considering <em>n</em> is rarely under 100 and usually  in the tens-of-thousands, takes a very long time to complete. Nonetheless, these  calculations are minimal in comparison to the other option: trying every combination  of conformation to find the lowest energy level.</p>
<p>A principle known as the <a href="http://en.wikipedia.org/wiki/Levinthal_paradox">Levinthal Paradox</a> shows just how intensive a process like this is. Because there are three basic conformations for each amino-acid (α-helix, β-pleated sheet, and the random coil), there are 3<sup><em>n</em></sup> conformations possible in a given protein with <em>n</em> amino-acids and a computational complexity of O(3<sup><em>n</em></sup>). To put this in perspective, an amino-acid chain with only 100 amino-acids has 3<sup>100</sup> conformations. That is just about <strong>5 * 10<sup>47</sup></strong> combinations that would have to be searched. Assuming this was run on a home computer (mind you a very, very high performance home computer) which can calculate 1,000 conformations per second, it would take roughly 1.6 * 10<sup>37</sup> <strong>YEARS</strong>. So, as intuition suggests, doing the calculations in previous sections is vastly more efficient than an exhaustive search of all combinations.</p>
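<p>For anyone who wants to check the arithmetic, here is a quick sketch in PHP. The variable names are made up for the example, and the rate of 1,000 conformations per second is simply the assumption used above.</p>
<pre>&lt;?php
// Back-of-the-envelope check of the exhaustive-search estimate.
$aminoAcids = 100;
$conformations = pow(3, $aminoAcids);   // too big for an integer, so PHP returns a float (about 5.15e47)
$perSecond = 1000;                      // assumed search rate from the text
$secondsPerYear = 365.25 * 24 * 60 * 60;

$years = $conformations / $perSecond / $secondsPerYear;
printf("%.2e conformations, %.2e years\n", $conformations, $years);</pre>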
<h2>A Brief Case Study</h2>
<p>To bring all of this article together, I decided to run the algorithm above on a small protein. I eventually settled on Myoglobin for a few reasons. Myoglobin stores and transports oxygen in muscle tissue and is used as a marker of muscle injury, so it is well studied and data on it is readily available. Additionally, it only contains 153 amino-acids, making the calculations relatively quick. It also has some bonding characteristics which are favorable for reducing calculation time.</p>
<p>Running the calculations took 3 hours, 17 minutes, 34 seconds on a 21-processor distributed computing platform. Some of this time, however, was spent on extra calculations to generate the data for the graphs discussed next.</p>
<p>For purposes of this article, at each conformation, the software calculated  the overall potential energy of the system which was then plotted vs. the position  in z and x*y locations of a certain atom (the exact atom was chosen because it  had the least number of bonds to minimize calculations). This method of plotting  had to be used because otherwise the plot would expand into a 4<sup>th</sup> dimension; x, y, z, and energy. An animated graph would be perfect for this, but for demonstration purposes truncating the 3 independent variables seemed acceptable.  So to allow all of the data to be shown in a  single image x and y were multiplied to give us only three variables: x*y, z,  and energy:</p>
<p style="text-align: center;"><img src="http://aaron-rosenfeld.com/files/protein/plot1.jpg" alt="Plot One" /></p>
<p>As the plot shows, at different values of x, y, and z the energy of the system fluctuates. As was stated previously, getting from one conformation to another involves using a gradient to determine which direction will lower the potential energy the most. The following shows this gradient (to be completely correct, it shows the magnitude of the gradient since the gradient itself is a vector valued function). The dark areas show the highest magnitude (which is negative since it is a minimum) and light show the lowest. As a side note, there is an overall gradient towards the very strong black dot (very strong vectors) that is faint and hard to see.</p>
<p style="text-align: center;"><img src="http://aaron-rosenfeld.com/files/protein/plot2.jpg" alt="Plot Two" /></p>
<p>As it should, the highest magnitude gradient is in the location of the global minimum. The software, in a sense, moved from the bottom left of this plot (since this atom can be considered the origin of the system) up, right, left, and down in the direction of the gradient until it hit the minimum when the atom was at its lowest energy state. In the end, as confirmed by Wikipedia, the structure looks like:</p>
<p style="text-align: center;"><img src="http://aaron-rosenfeld.com/files/protein/protein.jpg" alt="Protein Image" /></p>
<h2>What to make of it all</h2>
<p>I wrote this article to give others a perspective on how daunting the problem  of protein folding is. What I have explained is considered the easiest when it  comes to the mathematics. There are alternative approaches using energy levels  of atoms, more advanced vector-based equations, and more accurate methods of  describing the motion of atoms than has been explored here. But, the basic ideas  were covered here. I hope everyone found this article interesting and informative.  Any feedback is always appreciated.</p>
<h2>Citations</h2>
<p><strong>For Article</strong></p>
<ul>
<li><a href="http://www-wales.ch.cam.ac.uk/~mark/levinthal/levinthal.html">J. T. P. DeBrunner and E. Munck (1969). Levinthal&#8217;s Paradox. Retrieved January 26, 2008, from Levinthal&#8217;s Paradox Web site.</a></li>
<li><a href="http://en.wikipedia.org/wiki/Protein_folding">Protein folding. (2007, December 25). In Wikipedia, The free encyclopedia. Retrieved January 17, 2008.</a></li>
<li><a href="http://en.wikipedia.org/wiki/Molecular_dynamics">Molecular dynamics. (2007, December 29). In Wikipedia, The free encyclopedia. Retrieved January 16, 2008.</a></li>
<li>Myers J.K. and Pace C.N. (1996) Hydrogen bonding stabilizes globular proteins, Biophys. J. 71: 2033-2039.</li>
<li><a href="http://cmm.info.nih.gov/modeling/">National Institutes of Health (2001 January 01). NIH Center for Molecular Modeling. Retrieved January 26, 2008, from Center for Molecular Modeling Web site.</a></li>
<li><a href="http://www.princeton.edu/pr/pwb/99/0927/math.shtml">Steven, Schultz (1999 September 27). Math helps explain protein folding. Princeton Weekly Bulletin, 89, Retrieved September 25, 2008.</a></li>
</ul>
<p><strong>For Case Study</strong></p>
<ul>
<li>Thompson J. D., Higgins D. G., Gibson T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.</li>
<li>Leckband, D. and Israelachvili, J. (2001) Intermolecular forces in biology. Quart. Rev. Biophys. 34: 105-267.</li>
<li><a href="http://en.wikipedia.org/wiki/Monte_Carlo_method">Monte Carlo Method. (2008, January 16). In Wikipedia, The free encyclopedia. Retrieved January 20, 2008.</a></li>
<li><a href="http://en.wikipedia.org/wiki/Myoglobin">Myoglobin. (2008, January 11). In Wikipedia, The free encyclopedia. Retrieved January 21, 2008.</a></li>
<li><a href="http://www.qtp.ufl.edu/Aces2/">University of Florida (2006 January 31). ACES II. Retrieved January 20, 2008, from ACES II Manual Release 2.0 Web site.</a></li>
<li><a href="http://skuld.bmsc.washington.edu/raster3d/raster3d.html">University of Washington, (2006 May 11). Raster3d. Retrieved January 26, 2008, from Raster3d Web site.</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://aaron-rosenfeld.com/2008/07/06/protein-folding-simulation/feed/</wfw:commentRss>
		<feedburner:origLink>http://aaron-rosenfeld.com/2008/07/06/protein-folding-simulation/</feedburner:origLink></item>
		<item>
		<title>Send and Receive SMS with PHP</title>
		<link>http://feeds.feedburner.com/~r/AaronRosenfeld/~3/326011109/</link>
		<comments>http://aaron-rosenfeld.com/2008/07/03/send-and-receive-sms-with-php/#comments</comments>
		<pubDate>Thu, 03 Jul 2008 18:07:35 +0000</pubDate>
		<dc:creator>Aaron Rosenfeld</dc:creator>
		
		<category><![CDATA[Servers]]></category>

		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://aaron-rosenfeld.com/?p=12</guid>
		<description><![CDATA[I have wanted a way to send and receive text-messages (SMS) with PHP for some time now.  The main problem isn&#8217;t that it is impossible &#8212; there are dozens of companies that provide PEAR or other gateways that do exactly this &#8212; it is the fact that I do not want to pay every [...]]]></description>
			<content:encoded><![CDATA[<p>I have wanted a way to send and receive text-messages (SMS) with PHP for some time now.  The main problem isn&#8217;t that it is impossible &#8212; there are dozens of companies that provide PEAR or other gateways that do exactly this &#8212; it is the fact that I do not want to pay every time I need to send a message.  I understand that there is no intrinsic connection between the internet and the cell-phone providers&#8217; wireless networks.  The companies that provide gateways between the two do so by having a GSM modem or another piece of physical hardware that sends the message &#8212; it is obvious why they charge.</p>
<p>What hit me today was kind of interesting: most wireless companies provide an SMS gateway to their subscribers (a large list can be found on <a href="http://en.wikipedia.org/wiki/SMS_gateway" target="_blank">Wikipedia</a>).  This allows SMS messages to be sent via e-mail.  For example, if I wanted to send a message to a Verizon customer with the phone number 123-456-7890, I can e-mail it to 1234567890@vtext.com.  This is then forwarded, by Verizon, to the cell-phone.  That means it is completely possible to set up a PHP script, using nothing more than the <a href="http://us.php.net/function.mail" target="_blank">mail() function</a>, to send a text message.</p>
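<p>As a rough sketch of what that looks like, the snippet below builds the gateway address and hands the message to mail(). The helper names smsGatewayAddress and sendSms are hypothetical, and vtext.com is the only gateway domain taken from this post; whether a message is actually delivered depends on the server&#8217;s mail setup and on the carrier.</p>
<pre>&lt;?php
// Build the e-mail address for a carrier's SMS gateway.
function smsGatewayAddress(string $phoneNumber, string $gatewayDomain): string
{
	// Keep digits only, so "123-456-7890" becomes "1234567890".
	$digits = preg_replace('/\D/', '', $phoneNumber);
	return $digits . '@' . $gatewayDomain;
}

// Send the text through the gateway; returns whatever mail() reports.
function sendSms(string $phoneNumber, string $gatewayDomain, string $text): bool
{
	$to = smsGatewayAddress($phoneNumber, $gatewayDomain);
	// The subject is left empty; how carriers display subjects is not covered here.
	return mail($to, '', $text);
}

echo smsGatewayAddress('123-456-7890', 'vtext.com'), "\n";   // 1234567890@vtext.com
// sendSms('123-456-7890', 'vtext.com', 'Server is up');     // needs a working mail setup</pre>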
<p>Although that thought really excited me, I then realized I still had to tackle the other side of the problem: How do I send a text message to my server?  This turned out to be a bit of a thought-process challenge but the implementation was very easy.</p>
<p>Most web hosts allow customers to set up mail forwarding.  What many people do not know is that this forwarding is not limited to other e-mail addresses.  You can also &#8220;forward&#8221; (technically you &#8220;pipe&#8221; it) mail to a script.  How this is actually done depends on how the server handles mail.  If cPanel is installed, it is very simple.  Otherwise the mail server itself has to be configured to pipe incoming messages to a program.</p>
<p>With all of these things in my head, I set up the e-mail address sms@aaron-rosenfeld.com to pipe to the script |/path/myroot/sms.php (for the non-Linux type, the first character &#8220;|&#8221; is required.  That is why it&#8217;s known as the pipe!).  The script did nothing besides write &#8220;Message Received!&#8221; to a log file when it was executed (another side note: make sure the file being piped to has execute privileges and starts with #!/usr/bin/php -q).</p>
<p>I quickly sent a text-message from my phone to sms@aaron-rosenfeld.com and sure enough I had &#8220;Message Received!&#8221; in my log file!  To read the e-mail itself requires a bit more code:</p>
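<pre>// Read the entire piped e-mail (headers and body) from standard input.
$handle = fopen('php://stdin', 'r');
$content = '';
while (!feof($handle)) {
	// Pull the message in 1 KB chunks until end-of-file.
	$content .= fread($handle, 1024);
}
fclose($handle);</pre>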
<p>I won&#8217;t go into detail but this just reads all the data that was piped from the e-mail to the PHP script.  At the end, $content will store the entire e-mail including headers.  To parse all of this out into a more readable form I used the following function (where $fullText is the $content from the last code-snippet):</p>
<pre>// Split the raw e-mail into lines; rtrim() below drops the "\r" of CRLF endings.
$lines = explode("\n", $fullText);
$inMsg = false;
$this-&gt;message = '';
foreach ($lines as $line)
{
	$line = rtrim($line, "\r");
	if (!$inMsg)
	{
		// The first blank line separates the headers from the body.
		if (trim($line) == '')
		{
			$inMsg = true;
			continue;
		}
		$this-&gt;headers['all'][] = $line;
		if (preg_match("/^Subject: (.*)/", $line, $matches))
			$this-&gt;headers['subject'] = $matches[1];
		if (preg_match("/^From: (.*)/", $line, $matches))
			$this-&gt;headers['from'] = $matches[1];
	}
	else
		$this-&gt;message .= $line . "\n";
}</pre>
<p>This sticks the e-mail message in $this-&gt;message and the headers in their own array, $this-&gt;headers[].  From here the possibilities are endless: database queries, server status monitors, checking e-mail, etc.  I may post the full classes that I ended up using at some point but for now I thought this would be adequate.</p>
]]></content:encoded>
			<wfw:commentRss>http://aaron-rosenfeld.com/2008/07/03/send-and-receive-sms-with-php/feed/</wfw:commentRss>
		<feedburner:origLink>http://aaron-rosenfeld.com/2008/07/03/send-and-receive-sms-with-php/</feedburner:origLink></item>
		<item>
		<title>Cryptography</title>
		<link>http://feeds.feedburner.com/~r/AaronRosenfeld/~3/326011110/</link>
		<comments>http://aaron-rosenfeld.com/2008/06/06/cryptography/#comments</comments>
		<pubDate>Sat, 07 Jun 2008 00:16:31 +0000</pubDate>
		<dc:creator>Aaron Rosenfeld</dc:creator>
		
		<category><![CDATA[Computer Science]]></category>

		<category><![CDATA[Mathematics]]></category>

		<guid isPermaLink="false">http://aaron-rosenfeld.com/?p=9</guid>
		<description><![CDATA[I wanted to talk about what I have learned from this book and spent the last few weeks discovering on my own. I’ll start with what most web-developers are used to: password storage. Any PHP or ASP programmer is familiar with the common one-way hash functions as MD5 and SHA. Both have undergone thorough mathematical benchmarks for reversibility and brute-forcing and are trusted by most everyone.]]></description>
			<content:encoded><![CDATA[<blockquote><p><strong>Update June 23<sup>rd</sup>, 2008: I recently gave a small talk on this subject which required a PDF format of this document.  This can be found <a href="http://aaron-rosenfeld.com/files/Encryption.pdf" target="_blank">here</a>.</strong></p></blockquote>
<p>So I recently bought an <a href="http://www.amazon.com/Modern-Cryptanalysis-Techniques-Advanced-Breaking/dp/047013593X" target="_blank">awesome cryptanalysis book</a> by Christopher Swenson that is bar none the best computer science related text I have ever read.  I would recommend it to everyone who is interested in the topic of Cryptography or Cryptanalysis.</p>
<p>But that isn&#8217;t what I wanted to write about here.  I wanted to talk about what I have learned from this book and spent the last few weeks discovering on my own.  I&#8217;ll start with what most web-developers are used to: password storage.  Any PHP or ASP programmer is familiar with the common one-way hash functions such as MD5 and SHA.  Both have been analyzed extensively for reversibility and brute-forcing and are very widely used, although collision attacks against MD5 have been published.</p>
<p>The obvious drawback of these hashes is just that, though: they are hashes; they are one-way.   For many this is fine since user login is simply a matter of comparing hashes.  However, password retrieval is impossible, and it is a feature many clients ask for.  Likewise, there are plenty of occasions where entire bodies of text need to be encrypted for transmission or storage purposes.</p>
<p>With that as the intro, I&#8217;ll start talking about the cryptography itself.  Note that I am by no means an authority on this topic.  I have taken two courses on cryptographic methods yet I am still far from understanding the ins and outs of such a diverse and rapidly-evolving field.</p>
<p><em>My notation: I will usually denote what base a number is in either with a subscript on the right side or with 0x as a prefix for hex or 0b for binary.  Assume base-10 if there is no indication.  All numbers are written with their most-significant value on the left, and if given a number N, N<sub>0</sub> is the least-significant bit (right-most), N<sub>1</sub> is the second-least-significant bit (second right-most), etc.  I use parallel vertical bars to represent concatenation of two values.  That is: <img src="http://www.codecogs.com/eq.latex?\inline&amp;space;p&amp;space;\parallel&amp;space;q&amp;space;=&amp;space;pq" border="0" alt="\inline p \parallel q = pq" /><br />
</em></p>
<h2>Basics</h2>
<p>There are three basic requirements for any strong cryptographic algorithm: concealment, dispersion, and avalanching.  Concealment simply means, given some plaintext (input text) character it should appear different in the ciphertext (encrypted form).  This should seem obvious and I will append to this definition in our first example below.</p>
<p>Dispersion means that, given some string of input characters, the resultant ciphertext should not preserve the ordering of the input (even if they do meet the concealment requirement).</p>
<p>Many people consider avalanching part of dispersion but I like to keep them separate since each can occur independently of the other.  Avalanching is when a very slight change in input (even only 1 bit) causes a large change in the ciphertext.  Take the two MD5 hashes below.  Only one character has been changed in the input:</p>
<p style="text-align: center;">Hello world!: 86fb269d190d2c85f6e0468ceca42a20<br />
Hello wordd!: 5447cf2589c0fb0b119cea40f48b9a51</p>
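<p>To put a number on &#8220;a large change&#8221;, the following sketch counts how many of the 128 output bits differ between two MD5 digests. The function name md5BitDifference is made up for this example; md5() with its second argument set to true returns the raw 16-byte digest instead of the hex string.</p>
<pre>&lt;?php
// Count how many of the 128 output bits differ between two MD5 digests.
function md5BitDifference(string $a, string $b): int
{
	$rawA = md5($a, true);   // true = raw 16-byte binary digest
	$rawB = md5($b, true);
	$diff = 0;
	for ($i = 0; $i &lt; 16; $i++) {
		// XOR leaves a 1 wherever the two bytes disagree.
		$x = ord($rawA[$i]) ^ ord($rawB[$i]);
		$diff += substr_count(decbin($x), '1');
	}
	return $diff;
}

echo md5('Hello world!'), "\n";
echo md5('Hello wordd!'), "\n";
echo md5BitDifference('Hello world!', 'Hello wordd!'), " of 128 bits differ\n";</pre>
<p>For a hash with good avalanching you would expect the count to land somewhere around half of the bits.</p>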
<h2>Requirement One: Concealment</h2>
<p>I think the best way to understand the first requirement is to do a case study.  To make things easy, we will examine one of the most basic encryption methods available: the exclusive-or.  If you are unfamiliar with what exclusive-or (XOR from here on) does, it takes two bit inputs and returns 1 if one <em>and only one</em> bit is 1. We denote this operation with a circled multiplication sign.  It can be more formally  described by <img src="http://www.codecogs.com/eq.latex?\inline p&amp;space;\otimes&amp;space;q&amp;space;=&amp;space;p\bar{q}&amp;space;+&amp;space;\bar{p}q" border="0" alt="\inline p \otimes q = p\bar{q} + \bar{p}q" />. Logically, XOR can be applied to many bits by performing the operation bit-by-bit as in this example:</p>
<p style="text-align: center;"><img class="aligncenter" src="http://www.codecogs.com/eq.latex?101010_2&amp;space;\otimes&amp;space;111000_2&amp;space;=&amp;space;010010_2" border="0" alt="101010_2 \otimes 111000_2 = 010010_2" /></p>
<p>XOR is a very powerful operation because it is reversible.  That is, if a bit <em>p</em> is XOR&#8217;ed with a bit <em>q</em>, to give a resultant bit <em>y</em>, the bit <em>p</em> can be determined by simply XOR&#8217;ing y with <em>q</em> (or <em>q</em> can be found by XOR&#8217;ing <em>y</em> and <em>p</em>).</p>
<p>Example: We have a plaintext of 0x61 0x61 0x72 0x6F 0x6E (&#8220;aaron&#8221; in hex), denoted <em>P</em>, and we want to encrypt it with a single byte 0x11, denoted <em>K</em>.  The result is:</p>
<p style="text-align: center;"><img src="http://www.codecogs.com/eq.latex?&amp;space;(P_4&amp;space;\otimes&amp;space;K)\parallel(P_3&amp;space;\otimes&amp;space;K)\parallel(P_2&amp;space;\otimes&amp;space;K)\parallel(P_1&amp;space;\otimes&amp;space;K)\parallel(P_0&amp;space;\otimes&amp;space;K)&amp;space;=" border="0" alt="\inline (P_4 \otimes K)\parallel(P_3 \otimes K)\parallel(P_2 \otimes K)\parallel(P_1 \otimes K)\parallel(P_0 \otimes K) =" /></p>
<p style="text-align: center;"><img src="http://www.codecogs.com/eq.latex?&amp;space;(0x61&amp;space;\otimes&amp;space;K)\parallel(0x61&amp;space;\otimes&amp;space;K)\parallel(0x72&amp;space;\otimes&amp;space;K)\parallel(0x6F&amp;space;\otimes&amp;space;K)\parallel(0x6E&amp;space;\otimes&amp;space;K)&amp;space;=" border="0" alt="\inline (0x61 \otimes K)\parallel(0x61 \otimes K)\parallel(0x72 \otimes K)\parallel(0x6F \otimes K)\parallel(0x6E \otimes K) =" /></p>
<p>This gives us the encrypted text 0x70 0x70 0x63 0x7E 0x7F (the last byte, 0x7F, is not a printable ASCII character but that is irrelevant here).  To get back to the plaintext <em>P</em>, this encrypted text can just be XOR&#8217;ed with the key again.</p>
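<p>The same example as a small PHP sketch. The function name xorBytes is an assumption for illustration; it XOR&#8217;s each byte of the data with the key, repeating the key when it is shorter than the data, which for our one-byte key means every byte is XOR&#8217;ed with 0x11.</p>
<pre>&lt;?php
// XOR every byte of $data with the bytes of $key, repeating the key as needed.
// Running the same function on the result gives the original data back.
function xorBytes(string $data, string $key): string
{
	$out = '';
	$keyLength = strlen($key);
	for ($i = 0; $i &lt; strlen($data); $i++) {
		$out .= chr(ord($data[$i]) ^ ord($key[$i % $keyLength]));
	}
	return $out;
}

$plain = 'aaron';
$key = "\x11";
$cipher = xorBytes($plain, $key);

echo bin2hex($cipher), "\n";          // 7070637e7f
echo xorBytes($cipher, $key), "\n";   // aaron</pre>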
<p>For the average person, the encryption above is a pretty good way of mixing things up; going from &#8220;aaron&#8221; to &#8220;70 70 63 7E 7F&#8221;.</p>
<p>Not only that, it definitely meets the first requirement we set previously; the text is concealed.  But I am now going to append to that definition: Concealment not only means that a given input character (or number) should be different after encryption, but all <em>instances </em>of that character in the ciphertext should not be the same (although they don&#8217;t <em>all</em> have to be different either).  In other words, the number of times a letter appears in plaintext should not be easy to determine simply by looking at the ciphertext.</p>
<p>Here is why: Let&#8217;s look at the ciphertext.  There is one very obvious repetition in the ciphertext: the first two bytes are identical, which directly correlates to the first two bytes that were being encrypted!  To a trained eye, this will indicate the encryption method <em>may </em>be weak, although this can happen by sheer luck with even very strong encryption methods.</p>
<p>The next logical question is &#8220;Who cares? There is still no way to get from 0x70 back to 0x61 without the key.&#8221;  This is somewhat true, but it provides those trying to break the code (or obtain the key) crucial information to use to do something called <em>frequency analysis</em>.</p>
<h2>Frequency Analysis</h2>
<p>You may have played some of those cryptogram puzzles in newspapers; most of them use basic methods of non-digital cryptography that are based on a simple lookup table (&#8216;a&#8217; in the ciphertext means &#8216;t&#8217; in the plaintext, etc.) or other pattern-based method.  These are fun but hold little practical value due to a number of cryptanalysis tricks developed over the years.  The main one that I want to go over, though, is called <em>Frequency Analysis</em>.</p>
<p>Let&#8217;s say we have a big block of ciphertext like the following (the spaces mean nothing and are simply there to break the text up):</p>
<blockquote><p>XFUIF QFPQM FPGUI FVOJU FETUB UFTJO PSEFS UPGPS NBNPS FQFSG FDUVO JPOFT UBCMJ TIKVT UJDFJ OTVSF EPNFT UJDUS BORVJ MJUZQ SPWJE FGPSU IFDPN NPOEF GFODF QSPNP UFUIF HFOFS BMXFM GBSFB OETFD VSFUI FCMFT TJOHT PGMJC FSUZU PPVST FMWFT BOEPV SQPTU FSJUZ EPPSE BJOBO EFTUB CMJTI UIJTD POTUJ UVUJP OGPSU IFVOJ UFETU BUFTP GBNFS JDB</p></blockquote>
<p>To most people this could be solved pretty easily by guessing but we are going to do something a bit more analytic (although guessing is still a big part of it).</p>
<p>The English language is very well documented.  Information on letter-frequency is readily available, as can be seen in this graph (Wikipedia):</p>
<p style="text-align: center;"><img class="aligncenter" src="http://upload.wikimedia.org/wikipedia/commons/thumb/4/41/English-slf.png/340px-English-slf.png" alt="Letter frequency for English." /></p>
<p>As you can see, there are some letters that are used quite a lot (&#8216;E&#8217; and &#8216;T&#8217;) and some that are used very rarely (&#8216;Z&#8217; and &#8216;J&#8217;).  We can use this information to make more educated guesses at our plaintext-to-ciphertext mapping.  I won&#8217;t list them all but the three highest frequency letters in our ciphertext are:</p>
<p style="text-align: center;">F: 39 (0.15)<br />
U: 29 (0.11)<br />
P: 25 (0.09)</p>
<p>The number after each letter is the number of occurrences and the number in parentheses is that number divided by the total number of characters (268).  So let&#8217;s start by matching each of these up with their respective letter in the graph.  The most frequently occurring ciphertext letter is &#8216;F&#8217; (by a lot), and the most frequently occurring letter in English is &#8216;E&#8217; (by a lot) so we will assume &#8216;F&#8217; in ciphertext maps to &#8216;E&#8217; in plaintext.  Yes this is a guess, but a very logical one.  We will do the same for &#8216;U&#8217; and map it to the second-most-frequently occurring letter in English: &#8216;T&#8217;.  Same for &#8216;P&#8217; but note that, since &#8216;O&#8217; and &#8216;A&#8217; are both very close in frequency we should try both.</p>
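<p>Counting by hand gets old quickly, so here is a small PHP sketch that tallies the letters for us. The function name letterFrequencies is made up for the example; the ciphertext is the block quoted above, split across a few lines only for readability.</p>
<pre>&lt;?php
// Count letter frequencies in a ciphertext, ignoring spaces.
function letterFrequencies(string $ciphertext): array
{
	$letters = preg_replace('/[^A-Z]/', '', strtoupper($ciphertext));
	$counts = array_count_values(str_split($letters));
	arsort($counts);   // most frequent first
	return $counts;
}

$ciphertext =
	'XFUIF QFPQM FPGUI FVOJU FETUB UFTJO PSEFS UPGPS NBNPS FQFSG FDUVO ' .
	'JPOFT UBCMJ TIKVT UJDFJ OTVSF EPNFT UJDUS BORVJ MJUZQ SPWJE FGPSU ' .
	'IFDPN NPOEF GFODF QSPNP UFUIF HFOFS BMXFM GBSFB OETFD VSFUI FCMFT ' .
	'TJOHT PGMJC FSUZU PPVST FMWFT BOEPV SQPTU FSJUZ EPPSE BJOBO EFTUB ' .
	'CMJTI UIJTD POTUJ UVUJP OGPSU IFVOJ UFETU BUFTP GBNFS JDB';

$counts = letterFrequencies($ciphertext);
$total = array_sum($counts);

// Print the three most common letters with their share of the text.
foreach (array_slice($counts, 0, 3, true) as $letter =&gt; $count) {
	printf("%s: %d (%.2f)\n", $letter, $count, $count / $total);
}</pre>
<p>Run against the block above, this reproduces the three counts listed for &#8216;F&#8217;, &#8216;U&#8217;, and &#8216;P&#8217;.</p>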
<p>This process continues until either the cipher is broken or enough of a pattern is understood that the rest can be figured out.  From the three letters that we have mapped, &#8216;F&#8217; to &#8216;E&#8217;, &#8216;U&#8217; to &#8216;T&#8217;, and &#8216;P&#8217; to &#8216;O&#8217; (it could very well have been &#8216;A&#8217; and I had to try both), it may be clear that our cipher simply replaces every plaintext letter with the next letter in the alphabet.  After doing the substitutions and placing spaces properly we end up with:</p>
<blockquote><p>WE THE PEOPLE OF THE UNITED STATES IN ORDER TO FORM A MORE PERFECT UNION ESTABLISH JUSTICE INSURE DOMESTIC TRANQUILITY PROVIDE FOR THE COMMON DEFENCE PROMOTE THE GENERAL WELFARE AND SECURE THE BLESSINGS OF LIBERTY TO OURSELVES AND OUR POSTERITY DO ORDAIN AND ESTABLISH THIS CONSTITUTION FOR THE UNITED STATES OF AMERICA</p></blockquote>
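<p>Once the pattern is guessed, undoing it is mechanical. The sketch below shifts every letter by a fixed amount; the function name shiftLetters is an assumption for the example, and a shift of -1 reverses the &#8220;next letter in the alphabet&#8221; cipher used here.</p>
<pre>&lt;?php
// Shift every letter A-Z by $shift positions, wrapping around the alphabet.
// Spaces and other characters pass through unchanged.
function shiftLetters(string $text, int $shift): string
{
	$out = '';
	foreach (str_split(strtoupper($text)) as $char) {
		if ($char &gt;= 'A' &amp;&amp; $char &lt;= 'Z') {
			$offset = (ord($char) - ord('A') + $shift) % 26;
			// PHP's % can return a negative value, so wrap it back into 0..25.
			$out .= chr(ord('A') + ($offset + 26) % 26);
		} else {
			$out .= $char;
		}
	}
	return $out;
}

// A shift of -1 undoes the cipher on the first few groups of the ciphertext.
echo shiftLetters('XFUIF QFPQM FPGUI FVOJU FETUB UFT', -1), "\n";
// WETHE PEOPL EOFTH EUNIT EDSTA TES</pre>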
<p>I know I skimmed over this example and made some assumptions but the idea is what is important.  In reality, a simple offset cipher like that is very simple to crack.  It is harder when letters are randomly assigned or frequencies don&#8217;t match up as well.  In those cases we can use the same exact technique but use the well-documented frequencies of two and three letter strings or turn to other techniques (such as examining different letters&#8217; <a href="http://en.wikipedia.org/wiki/Index_of_coincidence" target="_blank">Index of Coincidences</a>).</p>
<h2>Back to the Case Study</h2>
<p>As our detour has shown, <em>a strong cipher cannot preserve the frequencies of letters</em>.  In our first example the only reason this occurred, however, was because our key was just 1 character long.  Had it been two or more, the first &#8220;a&#8221; and second &#8220;a&#8221; would have been XOR&#8217;ed with different values, thus yielding different ciphertexts.  Case in point: keys should not be one character long and should minimize the chance of a given letter encrypting to the same thing too often.</p>
<h2>Requirement Two: Dispersion</h2>
<p>Until now I have not addressed the second requirement: <em>dispersion</em>.  All the letters in the ciphertext, although in a different form, still represent the character in the equivalent position in the plaintext.  There are many easy ways of fixing this but many of them are not reversible or do not shuffle enough for real use. I will explain a couple methods of achieving basic levels of dispersion and will then discuss the primary method that most well-known ciphers use.</p>
<p>Going back to the first example, our plaintext was 0x61 0x61 0x72 0x6F 0x6E and ciphertext calculated to 0x70 0x70 0x63 0x7E 0x7F.  A simple method of shuffling is to simply rotate bits or bytes in a predefined pattern.  For example, we could move every character left one position to get 0x70 0x63 0x7E 0x7F 0x70.  We could do this in any pattern we want as long as it is reversible (in reality it can be irreversible but for our basic methods of encryption we need to have an easy way of going back).</p>
<p>Another method is to shift the bits in each byte by a certain amount.  If you are unfamiliar with what bit-shifting is, it simply moves all bits left or right by a certain amount and we denote these by &lt;&lt; and &gt;&gt; respectively.  It is interesting to note that each left shift by one bit multiplies the number by 2 (and each right shift divides it by 2, discarding any remainder).  Here is an example:</p>
<p style="text-align: center;"><img class="aligncenter" src="http://www.codecogs.com/eq.latex?1001_2&amp;space;\ll&amp;space;&amp;space;1_2&amp;space;=&amp;space;10010_2&amp;space;\Leftrightarrow&amp;space;9_{10}&amp;space;\ll&amp;space;1_{10}&amp;space;=&amp;space;18_{10}" border="0" alt="1001_2 \ll  1_2 = 10010_2 \Leftrightarrow 9_{10} \ll 1_{10} = 18_{10}" /></p>
<p>Note that when we shift the binary number 1001 once, we just move every bit over to the left and add a 0 to the right side.  The equivalent is shown in base-10 on the right.  Assume all bit-shifts from here on are done in binary and then converted back to the base in which we are dealing.</p>
<p>This method is very well suited to disguising numbers (and characters by using their ASCII values) because it is seemingly random to the untrained eye.  If we did a single left-shift on each character in the original ciphertext, instead of 0x70 0x70 0x63 0x7E 0x7F we would have 0xE0 0xE0 0xC6 0xFC 0xFE.  One of the other methods that I personally like is shifting each byte by its index plus 1.  That is, the least-significant byte is shifted 1 time, the second least-significant 2 times, etc.</p>
<p>Another important note is some people do shifts in a different fashion.  Instead of shifting all the bits over by one, they literally rotate the bits.  For our example above:</p>
<p style="text-align: center;"><img class="aligncenter" src="http://www.codecogs.com/eq.latex?1001_2&amp;space;\ll&amp;space;&amp;space;1_2&amp;space;=&amp;space;0011_2&amp;space;\Leftrightarrow&amp;space;9_{10}&amp;space;\ll&amp;space;1_{10}&amp;space;=&amp;space;3_{10}" border="0" alt="1001_2 \ll  1_2 = 0011_2 \Leftrightarrow 9_{10} \ll 1_{10} = 3_{10}" /></p>
<p>Note that the leftmost bit is not dropped but wraps back around to the right side.  This is my preferred way of doing shifts because it guarantees the number of bits will remain unchanged. Note that I put the base-10 equivalent in just for reference.  The math behind rotational shifts like this gets a bit messy (although not too complex) and is really unimportant to us since we are simply using it to mask our original bits.</p>
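<p>PHP has no built-in rotate operator, so a rotation has to be assembled from two ordinary shifts. Here is a minimal sketch; the function names rotateLeft and rotateRight and the default width of 8 bits are assumptions chosen for the example.</p>
<pre>&lt;?php
// Rotate the low $width bits of $value left by $count positions.
function rotateLeft(int $value, int $count, int $width = 8): int
{
	$mask = (1 &lt;&lt; $width) - 1;
	$count %= $width;
	$value &amp;= $mask;
	// Bits pushed off the left end come back in on the right.
	return (($value &lt;&lt; $count) | ($value &gt;&gt; ($width - $count))) &amp; $mask;
}

// Rotating right by $count is the same as rotating left by $width - $count.
function rotateRight(int $value, int $count, int $width = 8): int
{
	return rotateLeft($value, $width - ($count % $width), $width);
}

printf("%04b\n", rotateLeft(0b1001, 1, 4));                // 0011
printf("0x%02X\n", rotateLeft(0x63, 1));                   // 0xC6
printf("0x%02X\n", rotateRight(rotateLeft(0x63, 3), 3));   // 0x63</pre>
<p>The last line shows why the rotational form is convenient for us: rotating back by the same amount always recovers the original byte, since no bits are ever lost.</p>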
<p>From now on when I talk about bit-shifts I am referring to this rotational method unless I note otherwise.</p>
<p>Bit shifts, byte shifts, and XOR&#8217;s can be combined in many different ways to make cracking a cipher much more difficult.  This is pretty impressive for a single-round encryption that has nothing but an XOR base to it.</p>
<h2>Requirement Three: The Avalanche Effect</h2>
<p>This is the first place I will address the avalanche effect but we will develop the ideas more fully in the final section. Our encryption algorithms so far have basically just masked our input text and moved the bytes around. But imagine this: let&#8217;s say we have a 1 round encryption scheme that XOR&#8217;s once with a key given by 0xA1 0x12 0x41 and then rotates all the bytes once. Now, imagine we feed it the plaintext 0x01 0x02 0x03:</p>
<p style="text-align: center;"><img src="http://www.codecogs.com/eq.latex?(0x01&amp;space;\otimes&amp;space;0xA1)&amp;space;\parallel&amp;space;(0x02&amp;space;\otimes&amp;space;0x12)&amp;space;\parallel&amp;space;(0x03&amp;space;\otimes&amp;space;0x41)&amp;space;=&amp;space;0xA0&amp;space;\parallel&amp;space;0x10&amp;space;\parallel&amp;space;0x42" border="0" alt="(0x01 \otimes 0xA1) \parallel (0x02 \otimes 0x12) \parallel (0x03 \otimes 0x41) = 0xA0 \parallel 0x10 \parallel 0x42" /></p>
<p style="text-align: center;"><img src="http://www.codecogs.com/eq.latex?(0xA0&amp;space;\parallel&amp;space;0x10&amp;space;\parallel&amp;space;0x42)&amp;space;\ll&amp;space;1&amp;space;=&amp;space;&amp;space;\&amp;space;0x10&amp;space;\parallel&amp;space;0x42&amp;space;\parallel&amp;space;0xA0" border="0" alt="(0xA0 \parallel 0x10 \parallel 0x42) \ll 1 = 0x10 \parallel 0x42 \parallel 0xA0" /></p>
<p>Now imagine we do all of that the same, except with 0x42 instead of 0x41 as the last key byte.  Our output would be 0x10 0x41 0xA0.  Pretty close to the original, isn&#8217;t it?  Only one output byte changed, and the same thing happens if we change a single plaintext byte.  This can become a fatal flaw in an algorithm.  By trying many different inputs (or in this case, one), we can easily determine which output byte corresponds to which input byte.  From that, the key itself can be found by calculating the difference (the XOR) between the input and output bits in that byte.</p>
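<p>The example can be replayed with a short Python sketch.  The helper names are hypothetical, and the swap of 0x42 for 0x41 is modelled as a change to the last key byte, since that is the reading that reproduces the quoted output 0x10 0x41 0xA0.</p>

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings position by position."""
    return bytes(x ^ y for x, y in zip(a, b))


def rotate_bytes_left(data: bytes, count: int = 1) -> bytes:
    """Rotate whole bytes left; a negative count rotates right."""
    if not data:
        return data
    count %= len(data)
    return data[count:] + data[:count]


def encrypt_one_round(plaintext: bytes, key: bytes) -> bytes:
    """One round: XOR with the key, then rotate the bytes once."""
    return rotate_bytes_left(xor_bytes(plaintext, key), 1)


def as_hex(data: bytes) -> str:
    return " ".join(f"0x{b:02X}" for b in data)


plaintext = bytes([0x01, 0x02, 0x03])
c1 = encrypt_one_round(plaintext, bytes([0xA1, 0x12, 0x41]))
c2 = encrypt_one_round(plaintext, bytes([0xA1, 0x12, 0x42]))
print(as_hex(c1))  # 0x10 0x42 0xA0
print(as_hex(c2))  # 0x10 0x41 0xA0

# Only one output position differs between the two runs.
print(sum(x != y for x, y in zip(c1, c2)))  # 1

# With one known plaintext/ciphertext pair the whole key falls out.
recovered = xor_bytes(rotate_bytes_left(c1, -1), plaintext)
print(as_hex(recovered))  # 0xA1 0x12 0x41
```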
<p>The way to combat this is to ensure that each output byte depends on many input bytes.  In other words, if we added another step to the encryption process that XORs every byte with every other byte, a change in any single byte would be propagated to all the others and cause a much larger difference between two ciphertexts, even if their plaintexts are very similar.</p>
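<p>As a rough illustration of making bytes depend on each other, here is a Python sketch of a running XOR.  This is a swapped-in technique, not the exact step described above: XORing every byte with literally every other byte would leave all output bytes identical, so the sketch chains each byte into the next one instead.  The function names are made up for the example.</p>

```python
def chain_forward(data: bytes) -> bytes:
    """Replace each byte with the XOR of itself and all bytes before it."""
    out = []
    running = 0
    for b in data:
        running ^= b
        out.append(running)
    return bytes(out)


def unchain_forward(data: bytes) -> bytes:
    """Undo chain_forward by XORing each byte with the previous chained byte."""
    out = []
    previous = 0
    for b in data:
        out.append(b ^ previous)
        previous = b
    return bytes(out)


a = chain_forward(bytes([0x01, 0x02, 0x03]))
b = chain_forward(bytes([0x00, 0x02, 0x03]))  # only the first byte differs
print(sum(x != y for x, y in zip(a, b)))      # 3
```

<p>A change in the first byte now reaches every output byte, but a change in the last byte still only affects the last one.  That is part of why a single pass is not enough and several rounds are used.</p>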
<h2>Round-based Ciphers</h2>
<p>Everything we have talked about until now is what&#8217;s called <em>single-rounded</em>.  An input is given to a function that performs some shuffling and distorting of the data, and it then gives us a ciphertext all in one go.  The next important topic is using more than one round for encryption.  Entire books have been dedicated to this topic, so I will barely scratch the surface, but I will try to cover the general concepts.</p>
<p>A round-based cipher essentially just does what we have been discussing a number of times.  In each <em>round</em>, the current text is manipulated by a function (like the ones we have been talking about) and the output is used as the input for the next round.  What is unique about most round-based encryption methods is that they don&#8217;t use the same key every round.  Instead, a <em>key schedule</em> containing one key for each round is generated from the base key.  Many ciphers also make additional changes at the beginning and end of encryption based on the base key itself.</p>
<p>Until we get to the last section of this article, we must make the round function reversible.  Likewise, the key schedule must be generated in such a way that, given the same base key, all sub-keys will be the same.</p>
<p>To decrypt, the key schedule is regenerated and the rounds are run in reverse order, this time with a round function that undoes whatever the original round function did.</p>
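<p>Here is a toy Python sketch of that flow.  Everything in it is an assumption made for illustration: the key schedule (rotate the base key and XOR in the round number), the round function (XOR with the round key, then rotate the bytes once), and the requirement that the key be as long as the block.  It is not a real or secure cipher.</p>

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def rotate_bytes_left(data: bytes, count: int) -> bytes:
    count %= len(data)
    return data[count:] + data[:count]


def key_schedule(base_key: bytes, rounds: int) -> list:
    """Toy schedule: the same base key always yields the same sub-keys."""
    keys = []
    for i in range(1, rounds + 1):
        rotated = rotate_bytes_left(base_key, i)
        keys.append(bytes((b ^ i) & 0xFF for b in rotated))
    return keys


def round_forward(block: bytes, key: bytes) -> bytes:
    """Reversible round: XOR with the round key, then rotate bytes left."""
    return rotate_bytes_left(xor_bytes(block, key), 1)


def round_backward(block: bytes, key: bytes) -> bytes:
    """Inverse round: rotate bytes right, then XOR with the same key."""
    return xor_bytes(rotate_bytes_left(block, -1), key)


def encrypt(plaintext: bytes, base_key: bytes, rounds: int = 4) -> bytes:
    block = plaintext
    for key in key_schedule(base_key, rounds):
        block = round_forward(block, key)
    return block


def decrypt(ciphertext: bytes, base_key: bytes, rounds: int = 4) -> bytes:
    block = ciphertext
    # Regenerate the schedule and walk it from the last round to the first.
    for key in reversed(key_schedule(base_key, rounds)):
        block = round_backward(block, key)
    return block


base_key = bytes([0xA1, 0x12, 0x41])
message = bytes([0x01, 0x02, 0x03])
print(decrypt(encrypt(message, base_key), base_key) == message)  # True
```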
<p>The problem with this whole scheme is that round functions are generally very complex.  They involve bit-shifting, byte-shifting, XORing, masking bytes with other bytes, and generally moving things around a lot.  Making all of this reversible is actually much harder than it sounds and, in many cases, impossible.  Here is an example.</p>
<p>Say I want to make a round function <em>F</em> that XORs the round input <em>P<sub>i-1</sub></em> with the current key <em>K<sub>i</sub></em> and then XORs the result with the round input shifted by one byte, <em>S(P<sub>i-1</sub>)</em> (where <em>S</em> just denotes the shift function).  We know from the previous section that this will promote an avalanche effect.</p>
<p>This seems to be a pretty good method but it cannot be reversed.  The general form for a given round <em>i</em> can be shown as (<em>P<sub>0</sub></em> is our starting plaintext and assume our key schedule has already been determined and the key for round <em>i</em> is denoted by <em>K<sub>i</sub></em>):</p>
<p style="text-align: center;"><img class="aligncenter" src="http://www.codecogs.com/eq.latex?P_i&amp;space;=&amp;space;P_{i-1}&amp;space;\otimes&amp;space;K_i&amp;space;\otimes&amp;space;S(P_{i-1})" border="0" alt="P_i = P_{i-1} \otimes K_i \otimes S(P_{i-1})" /></p>
<p>Thus, if we run this cipher for three rounds:</p>
<p style="text-align: center;"><img class="aligncenter" src="http://www.codecogs.com/eq.latex?\\&amp;space;P_1&amp;space;=&amp;space;P_0&amp;space;\otimes&amp;space;K_1&amp;space;\otimes&amp;space;S(P_0)&amp;space;\\&amp;space;P_2&amp;space;=&amp;space;P_1&amp;space;\otimes&amp;space;K_2&amp;space;\otimes&amp;space;S(P_1)&amp;space;\\&amp;space;P_3&amp;space;=&amp;space;P_2&amp;space;\otimes&amp;space;K_3&amp;space;\otimes&amp;space;S(P_2)&amp;space;\\" border="0" alt="P_1 = P_0 \otimes K_1 \otimes S(P_0) \\ P_2 = P_1 \otimes K_2 \otimes S(P_1) \\ P_3 = P_2 \otimes K_3 \otimes S(P_2) \\" /></p>
<p>Now, if we try to run that in reverse we have a problem.  Undoing any given step requires the <em>previous</em> step&#8217;s output, which is exactly what we are trying to recover.  When we were encrypting we had access to this.  Going the other direction we do not.  This would work as a one-way hashing method but it fails for two-way encryption.</p>
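<p>One concrete way to see that information is lost is to find two different inputs that give the same round output.  The Python sketch below assumes <em>S</em> is the rotational one-byte shift over the whole block; the function name <code>bad_round</code> and the sample values are made up for the example.</p>

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def rotate_bytes_left(data: bytes, count: int = 1) -> bytes:
    count %= len(data)
    return data[count:] + data[:count]


def bad_round(p: bytes, k: bytes) -> bytes:
    """P_i = P_(i-1) XOR K_i XOR S(P_(i-1)), with S a one-byte rotation."""
    return xor_bytes(xor_bytes(p, k), rotate_bytes_left(p, 1))


key = bytes([0xA1, 0x12, 0x41])
p_a = bytes([0x01, 0x02, 0x03])
# XOR the same constant into every byte of the input.
p_b = bytes(b ^ 0x55 for b in p_a)

print(p_a != p_b)                                # True
print(bad_round(p_a, key) == bad_round(p_b, key))  # True
```

<p>The constant cancels itself out between the input and its rotated copy, so both inputs collide.  No decryption routine could tell them apart, whatever the key is.</p>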
<h2>Avalanche Effect and the Feistel Structure</h2>
<p>The way we get around this problem was devised by Horst Feistel at IBM in the early 1970s.  His method, known as the Feistel Structure, is one of the most important constructions in modern cryptology and is used by some of the big names such as <a href="http://en.wikipedia.org/wiki/Data_Encryption_Standard" target="_blank">DES</a> and <a href="http://en.wikipedia.org/wiki/Blowfish_%28cipher%29" target="_blank">Blowfish</a>.  The beauty of his algorithm is that any round function can be used, the round function does <em>not</em> have to be reversible, and it does a great job of maximizing the avalanche effect in a minimal number of rounds.</p>
<p>The way it works is ingenious.  The plaintext is split into two equal pieces (halves) denoted <em>L</em> and <em>R</em>.  During encryption the right half is sent through the round function along with the round key.  The result is then XORed with the left half.  That value is used as the right half of the next round&#8217;s input, and the right half of the current round&#8217;s input is used as the left.  It is easier to understand by reading these:</p>
<p style="text-align: center;"><img src="http://www.codecogs.com/eq.latex?\\&amp;space;L_i&amp;space;=&amp;space;R_{i-1}&amp;space;\\&amp;space;R_i&amp;space;=&amp;space;L_{i-1}&amp;space;\otimes&amp;space;F(R_{i-1},&amp;space;K_i)" border="0" alt="L_i = R_{i-1} \\ R_i = L_{i-1} \otimes F(R_{i-1}, K_i)" /></p>
<p>The most important thing to notice is that undoing a round does not require anything we no longer have, unlike in our last example: the round function is only ever applied to a half that is passed along unchanged.  This can be shown by rearranging the expressions above and shifting the indices for easier understanding:</p>
<p style="text-align: center;"><img src="http://www.codecogs.com/eq.latex?\\&amp;space;R_i&amp;space;=&amp;space;L_{i+1}&amp;space;\\&amp;space;L_i&amp;space;=&amp;space;R_{i+1}&amp;space;\otimes&amp;space;F(L_{i+1},&amp;space;K_{i+1})&amp;space;\\" border="0" alt="R_i = L_{i+1} \\ L_i = R_{i+1} \otimes F(L_{i+1}, K_{i+1}) \\" /></p>
<p>This means that to decrypt we simply iterate from <em>i</em>=<em>n</em>-1 down to <em>i</em>=<em>0</em> and we will arrive back at our plaintext!  This is quite amazing, seeing that we can use <em>any</em> round function, even one that discards data, since the data is still preserved (in a different form) in the other half.  This diagram from Wikipedia shows the whole process very well:</p>
<p style="text-align: center;"><img class="aligncenter" src="http://upload.wikimedia.org/wikipedia/en/d/d2/Feistel.png" alt="" /></p>
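<p>To make the Feistel equations concrete, here is a minimal Python sketch.  The round function is deliberately one that cannot be reversed (a truncated SHA-256 of the half and the round key); that choice, the function names, and the hard-coded round keys are all assumptions for the example and are not taken from DES or Blowfish.  The block must have an even length, with each half at most 32 bytes.</p>

```python
import hashlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def round_function(half: bytes, key: bytes) -> bytes:
    """Non-reversible F: a hash of half + key, cut down to the half's length."""
    return hashlib.sha256(half + key).digest()[:len(half)]


def feistel_encrypt(block: bytes, round_keys: list) -> bytes:
    middle = len(block) // 2
    left, right = block[:middle], block[middle:]
    for key in round_keys:
        # L_i = R_(i-1);  R_i = L_(i-1) XOR F(R_(i-1), K_i)
        left, right = right, xor_bytes(left, round_function(right, key))
    return left + right


def feistel_decrypt(block: bytes, round_keys: list) -> bytes:
    middle = len(block) // 2
    left, right = block[:middle], block[middle:]
    for key in reversed(round_keys):
        # R_(i-1) = L_i;  L_(i-1) = R_i XOR F(L_i, K_i)
        left, right = xor_bytes(right, round_function(left, key)), left
    return left + right


round_keys = [b"k1", b"k2", b"k3", b"k4"]  # made-up keys, no real schedule
message = b"attack at dawn!!"              # 16 bytes, so two 8-byte halves
ciphertext = feistel_encrypt(message, round_keys)
print(feistel_decrypt(ciphertext, round_keys) == message)  # True
```

<p>Decryption never has to invert <code>round_function</code>.  It only calls it again on a half that was carried through unchanged, which is why a hash works fine here.</p>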
<p>The method above is extremely important and is the basis for many modern block ciphers.</p>
<p>That about wraps up what I wanted to cover.  I think most of the basics were covered and it gives a good foundation for further learning.  I plan on writing more in the future as my own skills progress and would love comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://aaron-rosenfeld.com/2008/06/06/cryptography/feed/</wfw:commentRss>
		<feedburner:origLink>http://aaron-rosenfeld.com/2008/06/06/cryptography/</feedburner:origLink></item>
	</channel>
</rss>
