Mark Gritter's Journal

Below are the 20 most recent journal entries recorded in Mark Gritter's LiveJournal:

Thursday, March 22nd, 2018
10:13 pm
Solving the "ghost" word game.
Ghost is played as follows: The first player names a letter. Each subsequent player adds a letter to the front or back of the word fragment so far. A player who produces a fragment that is itself a word of at least four letters loses. A player who is challenged to name a word containing the fragment they produced, and cannot, loses (otherwise the challenger loses).

Example play: H -> GH -> GHO -> NGHO -> ONGHO -> challenge -> STRONGHOLD

Who wins with optimal play in two-player Ghost?

I wrote some code to find out. You can see it here: https://gist.github.com/mgritter/60514d8591b62cfc08f8fdf0bf020f11
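To show the shape of the computation, here is a minimal game-tree sketch in Python. It is a hypothetical toy, much cruder than the linked gist, and the tiny WORDS list stands in for a real dictionary file:

```python
# A minimal two-player Ghost solver: memoized game-tree search.
from functools import lru_cache

WORDS = ["ghost", "stronghold", "aloe", "oboe", "tax"]
WORDSET = set(WORDS)

def is_fragment(frag):
    """True if frag appears inside at least one dictionary word."""
    return any(frag in w for w in WORDS)

@lru_cache(maxsize=None)
def current_player_wins(frag):
    """Can the player about to move force a win from this fragment?"""
    moves = []
    for c in "abcdefghijklmnopqrstuvwxyz":
        for nxt in (c + frag, frag + c):
            if not is_fragment(nxt):
                continue  # a successful challenge would beat this move
            if len(nxt) >= 4 and nxt in WORDSET:
                continue  # this move completes a word and loses
            moves.append(nxt)
    if not moves:
        return False  # every available move loses
    # A move wins if it leaves the opponent in a losing position.
    return any(not current_player_wins(m) for m in moves)

first_player_wins = any(not current_player_wins(c)
                        for c in "abcdefghijklmnopqrstuvwxyz"
                        if is_fragment(c))
```

Even with a real wordlist the memoized search stays fast, because every reachable fragment is a substring of some dictionary word.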

Thursday, December 21st, 2017
5:36 pm
Predictions for 2018
Tintri asked me to put together some technology predictions for 2018. You can access the "on-demand webinar" here: https://info.tintri.com/Predictions-Webinar-Series-On-Demand-JumpPage.htm, or wait for the blog entry, or ask for spoilers. :)
Friday, November 3rd, 2017
12:49 pm
Here's the letter I sent my representative about just one part of the Republican tax plan:

Representative Lewis,

The recently released tax reform legislation proposes to tax the tuition waivers that graduate students receive as part of their teaching and research work. Currently up to $5,250 in tuition per year is exempt from taxation.

I am writing to express my concern about the impact this will have on students who can least afford it, and on education in Minnesota and the United States. I was a graduate student at Stanford, before working for a startup company and ultimately returning to Minnesota, where I had done my undergraduate studies. The paycheck I received for teaching assistantships and research assistantships was far less than I could have been earning in industry. It was a struggle to find housing that was affordable and yet close enough to campus that it didn't interfere with my studies. While I was not living in poverty, many students with families found it much harder to make ends meet.

I, at least, had low student debt and the prospect of high-paying employment in my field of study. I was also attending a prestigious university and a department that was well-funded.
A decade later, costs of attending college have gone up. Students who form the bulk of college and university teachers for the next generation face very difficult choices.

For a concrete example, Stanford's current minimum rate for a TA is $9,630 per quarter for half-time work. (The other half is, of course, dedicated to classes and research.) At $38,520/year under the Republican tax plan, the student would owe taxes on $22,320 salary and $10,260 of tuition waivers. At the 12% bracket this comes to $5,853 in federal income tax. More than one-fifth of that, $1,231, is due to tax on money that the graduate student never sees: it is paid by the university, to the university. $1000 is a lot of money to a graduate student, even at Stanford.

Adding to the tax burden of those at the very start of their careers, who may just be starting their families, can only discourage people from graduate education. But Minnesota needs a next generation of PhDs to train its students in science, mathematics, and the humanities. Our educational strength is a key driver of economic growth.

Mark Gritter
Tuesday, August 1st, 2017
1:58 am
Selecting Strings in Solidity
For a personal project implementing procedurally generated content on Ethereum, I would like to find the most efficient way to accomplish the following in Solidity:

if ( n == 0 ) {
    buf.append( "unicorn" );
} else if ( n == 1 ) {
    buf.append( "raven" );
} else if ( n == 2 ) {
    buf.append( "sparrow" );
} else if ( n == 3 ) {
    buf.append( "scorpion" );
} // ... and so on through the rest of the list

Solidity encodes string literals inefficiently. In memory, there is one 256-bit integer for the string length, followed by as many 32-byte chunks as are necessary to hold the string. The compiled form is somewhat better, with a compact representation of the length, but the string itself is still padded out to a full 32-byte big-endian word:

/* "src/testchoice.sol":790:813 buf.append( "unicorn" ) */
0x7 <== length
0x756e69636f726e00000000000000000000000000000000000000000000000000 <== string literal
/* "src/testchoice.sol":790:793 buf */
/* "src/testchoice.sol":790:813 buf.append( "unicorn" ) */
/* "src/testchoice.sol":790:800 buf.append */
/* "src/testchoice.sol":790:813 buf.append( "unicorn" ) */
jump // in

The next obvious thing to try is a literal array. This would have some disadvantages for the final design, but it might be more efficient in gas usage:

string[11] memory animals = [ "unicorn", "raven", "sparrow", "scorpion",
"coyote", "eagle", "owl", "lizard",
"zebra", "duck", "kitten" ];
buf.append( animals[n] );

And a somewhat less-obvious route is to efficiently pack the strings into a single literal, and then extract the portion that I need. (I have to implement my own string append function anyway.)

string memory animals = "unicornravensparrowscorpioncoyoteeagleowllizardzebraduckkitten";
if ( n == 0 ) {
    buf.appendPacked( animals, 0, 6 );
} else if ( n == 1 ) {
    buf.appendPacked( animals, 6, 5 );
} else if ( n == 2 ) {
    buf.appendPacked( animals, 11, 7 );
} // ... and so on
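The offset/length bookkeeping is mechanical, so here is a Python sketch (not Solidity) that derives it from the word list. Computed directly from the word lengths the first entries come out as (0, 7), (7, 5), (12, 7), so the constants in a hand-written version are worth double-checking:

```python
# Build the packed literal and an (offset, length) table from the word
# list. An on-chain appendPacked() would copy
# packed[offset : offset + length] rather than slicing a Python string.
def pack_words(words):
    packed, table, offset = "", [], 0
    for w in words:
        table.append((offset, len(w)))
        packed += w
        offset += len(w)
    return packed, table

animals = ["unicorn", "raven", "sparrow", "scorpion", "coyote", "eagle",
           "owl", "lizard", "zebra", "duck", "kitten"]
packed, table = pack_words(animals)

def select(n):
    """Recover the n-th word from the packed string."""
    off, length = table[n]
    return packed[off:off + length]
```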

How do they compare? I compiled each version in isolation as part of a contract, and tested each for gas usage using Remix's JavaScript VM (at https://ethereum.github.io/browser-solidity/).

nested if: code size 4460 bytes, gas costs 22710, 22548, 22602

static array: code size 4198 bytes, gas costs 24682, 24682, 24682

packed literal: code size 3676 bytes, gas costs 22798, 22879, 22744

The packing makes a big difference on code size, and it runs nearly as efficiently as the original. The static array is significantly more expensive, though it does provide some code size benefits.
Sunday, June 11th, 2017
12:41 am
Plotting the Barnsley Fern in 1812?
The play "Arcadia" features a precocious young woman discovering fractal geometry (and the second law of thermodynamics in an amazing two-fer.) A contemporary mathematician explains that actually drawing a strange attractor by hand would require years and years of effort. Here's a sample fractal we can use for comparison, the Barnsley Fern: https://en.wikipedia.org/wiki/Barnsley_fern It's described by four linear transformations, which require four multiplications and up to six additions per application.
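The iterated function system itself is tiny. Here is a Python sketch (coefficients taken from the Wikipedia article) that computes the 2000 points; plotting them is left to your favorite charting tool:

```python
# The four affine maps of the Barnsley fern, iterated from the origin.
# Each step is four multiplications plus a handful of additions.
import random

MAPS = [  # (a, b, c, d, e, f, p): (x, y) -> (a*x + b*y + e, c*x + d*y + f)
    (0.0,   0.0,   0.0,  0.16, 0.0, 0.0,  0.01),
    (0.85,  0.04, -0.04, 0.85, 0.0, 1.6,  0.85),
    (0.2,  -0.26,  0.23, 0.22, 0.0, 1.6,  0.07),
    (-0.15, 0.28,  0.26, 0.24, 0.0, 0.44, 0.07),
]

def fern(n, seed=0):
    rng = random.Random(seed)
    x = y = 0.0
    points = []
    for _ in range(n):
        a, b, c, d, e, f, _p = rng.choices(MAPS,
                                           weights=[m[6] for m in MAPS])[0]
        x, y = a * x + b * y + e, c * x + d * y + f
        points.append((x, y))
    return points

points = fern(2000)
```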

Logarithms are not a 19th-century invention. They're a 17th-century invention. So anybody who can add by hand can multiply quickly as well, through the use of a published book of logarithms. Because the ability to plot accurately by hand is limited too, we don't need very many digits of accuracy.

Here's the Barnsley Fern with just 2000 data points, clearly recognizable as a plant shape.


So how quickly could our 19th-century heroine compute these 2000 points? Slide rules had already been invented (though their modern form had to wait until 1859.) Sadly, I cannot come up with a good reference for the actual speed of human calculation on, say, 8-digit numbers such as Kepler performed. If we allow Thomasina a leisurely 1 minute per lookup, with roughly eight lookups per iteration (two for each of the four multiplications), then she can perform about 7.5 iterations per hour of work. 2000 iterations are then 267 hours, or a mere seven modern work-weeks. Challenging to accomplish, but not impossible either for a dedicated worker.
Saturday, March 11th, 2017
8:51 pm
A Chomsky Hierarchy Puzzle
http://tracery.io/ is a tool to generate text. Its input is a mostly context-free grammar, and it outputs random instances of the language defined by that grammar.

Why only mostly context-free? Tracery has two special features. The first is that the outcome of a rule expansion can be saved for later, as a new rule.

So, for example, the grammar

  "origin":"[a:#letters#]#a# #a# #a#",

defines "letters" in a standard way as the regular language a, aa, aaa, aaaa, ... But when we expand "origin" we take whichever string of 'a's got generated and bind it to the new rule "a". So "origin" produces the language "a a a", "aa aa aa", "aaa aaa aaa", ...
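The binding behavior can be mimicked in a few lines of Python (a hypothetical toy, not how Tracery is implemented): expand "letters" once, bind the result, and reuse it.

```python
# Mimic Tracery's [a:#letters#] binding: expand the rule once, save the
# result, then splice the saved expansion in three times.
import random

def letters(rng, max_len=5):
    """The regular language a, aa, aaa, ...: pick a random length."""
    return "a" * rng.randint(1, max_len)

def origin(rng):
    a = letters(rng)            # [a:#letters#]  bind the expansion
    return " ".join([a, a, a])  # #a# #a# #a#    reuse the bound rule

sample = origin(random.Random(0))
```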

Moreover, Tracery includes modifiers that can be applied to the expansion of a rule. Most are things like "capitalize the first letter" or "add -s or -es as appropriate." But Tracery 2 (the dev version, unfortunately not the one with the live editor) includes a "replace" modifier in the standard implementation. That lets us take the previous example a step further:


Using replace to turn the second and third copies of #a# into b's and c's, the generated language becomes abc, aabbcc, aaabbbccc, ... Now this language is definitely not context-free. In fact, it's the canonical example of a context-sensitive language.

Here's the puzzle: is Tracery, or Tracery-with-replace, powerful enough to implement any context-sensitive language? How could we show that? If not, what is the class of languages that it can implement?
Saturday, February 18th, 2017
4:26 pm
The Puppy Functor
We introduce a finite category Pup. A standard representation has the objects {inside, outside}; in addition to the identities there are two morphisms, inverse to each other:

door : inside -> outside
door^-1 : outside -> inside

The literature conventionally refers to these operations as "let the dog out" and "let the dog in", respectively.

A "puppy functor" P is defined as follows, and is applicable only to categories in which products exist:

P(S) = S x ( inside or outside )

P( 1_S : S -> S ) = 1_S x 1

P( f : S -> T ) = f x ( door or door^-1 ) for all other morphisms

In the resulting category, any action must be accompanied by letting the dog outside or letting the dog back inside, as no function maps to the inside or outside identity. Further, any bijective morphism involves letting the dog out and then back in again (or the reverse), because

P( 1 ) = P( f f^-1 ) = P( f ) P( f^-1 )

Categories with endomorphisms (other than the identity, of course) cannot have a puppy functor. In fact, the only categories which have a puppy functor are those whose graphs are bipartite. The practical aspect of this restriction implies that dog owners cannot expect to perform any third task between letting the dog out and letting the dog back in.
Sunday, February 12th, 2017
2:56 am
Back from Tintri sales kickoff
Tintri has historically held its sales kickoff in the Bay Area, close to headquarters. This year we went to Las Vegas instead, to the Hard Rock Hotel and Casino. I don't like going to Las Vegas; among the reasons is the cigarette smoke, and the Hard Rock is no exception.

The Hard Rock's theme meant guitars everywhere. Literally propped up on some of the food serving tables.

Tintri's outside speakers this year were a little different. We had a performance by Butterscotch (a beatboxer) and talks by Vancouver social entrepreneur Mark Brand and filmmaker Nirvan Mullick, of "Caine's Arcade" fame. We participated in a Cardboard Challenge where we made games out of cardboard (and other items, like leftover party favors from the previous night.) You can see my team's two-person maze in action here: https://goo.gl/photos/Z66EprrvTFrugjfA8

Mark Brand made us all promise to leave the world better than we found it. The only other thing I can clearly remember from his speech, unfortunately, is that people keep wanting to show up and cook food for the homeless despite not actually knowing how to cook. I think he also had a few anecdotes about testing one's assumptions, and about not becoming a reality TV star.

Hard Rock has three poker tables which are often empty. They have a nightly 8pm $70 poker tournament that usually doesn't run. But the Tintri federal team showed up en masse so we had a fun time losing to one of the Tintri sales engineers. (A few other tourists showed up as alternates, and one of them chopped with Ronnie.) I busted in fourth after hitting an ace out of position against a set.

Tintri had a good year and there is encouraging news for the year to come. We won a partner award from CDW and they hope to expand their Tintri business, which is potentially very big. I harangued the Central region sales team to make more use of me--- last year I visited Wisconsin and Seattle, and had calls with customers in Chicago and Florida, but didn't meet any Minnesota, Iowa, Nebraska, or Missouri customers.
Saturday, January 28th, 2017
9:14 pm
Peter Thiel
Before the election, I thought that Peter Thiel's support for Trump was just Silicon Valley "disruptive" thinking at its worst. If the system isn't working, let's try something different (without a lot of thought as to whether it's actually better.) After all, Peter Thiel would not be the one suffering, so from his perspective it would be a risk-free experiment. He even tried making the case that we should take Trump seriously but not literally.

Foolishly, I did not realize that billionaires always have a hedge in place. In Thiel's case, it's citizenship in New Zealand, which he somehow obtained in 2011 despite not fulfilling a five-year residency requirement.
Friday, January 27th, 2017
3:30 pm
Retiring another interview question
I've been talking to people who have built cloud services that communicate with on-premises equipment (like storage arrays or routers.) One of my questions has been: "can you give me an example of some difficulties, or customer requirements, that you've encountered about the device-to-cloud communication channel?"

The clearest sign of somebody who has not deployed to production is that they don't know of any, or the best they can come up with is "use a web proxy."

Here's my list, which I'm pretty sure is incomplete.

1. Some customers want to know the exact IP address and port you will connect to, and hard-code it in their firewall. Want to change cloud service providers? Forget it. Want to scale via DNS load balancing, or add a second server in Europe? Are you really going to go back and tell all those customers to open up more holes? Probably not.

2. Customers want you to use an HTTPS proxy. (No problem, everybody expects that.)

3. The HTTPS proxy is username and password protected. (No big deal.)

4. There are multiple ways to provide the username and password to the proxy. Are all of them supported? (Oops, we didn't turn on GSSAPI/Kerberos until a customer specifically requested it.)

5. Some customers install a proxy that spoofs HTTPS, so you get a certificate mismatch. Does your code give up at that point, or ignore the error and do application-level authentication and application-level encryption? (If so, what is the point of HTTPS? Well, some customers won't support HTTP going out.)

6. Some proxies support HTTPS but not WebSockets.

7. The customer wants you to use a web proxy, and also to configure that web proxy to only allow your box access to one server.

8. An isolated storage network is standard. Some customers put the admin interface on a separate network too. (This is not too different from having no access at all.)

9. Some proxies have maximum connection lifetimes or maximum request sizes configured.
Wednesday, January 25th, 2017
10:31 pm
Generalized Fermat Through Brute Force
The generalized Fermat equation Ax^a + By^b = Cz^c has many integer solutions if 1/a + 1/b + 1/c > 1. It has rather fewer in the opposite case. I collected a bunch of references in this Quora question. The case a = 3, b = 2, c = 7 has been completely solved and there are a few "large" solutions including 9262^3 + 15312283^2 = 113^7.

Suppose you're not deep into analytic number theory and want to do a brute-force search for some more examples. A natural question to ask is whether some cases can be eliminated by a simple criterion that reduces the search space.

Take x^3 + y^11 = z^2, for example. If we calculate mod 4, then working through the 16 possibilities gives us only 7 cases that work:

y = 0 : x in { 0, 1 }
y = 1 : x in { 0, 2, 3 }
y = 2 : x = 1
y = 3 : x = 1

That means we can partition the search space and do only 7/16ths of the work, excluding all cases shown to be impossible mod 4. How well can we do with this approach? (In some cases we can get all the way to zero and prove impossibility! But not here.)

The farther out I search, the incrementally better "filters" I find. Calculating the possibilities mod 720 reduces the search space to about 4.3% of the possibilities. Improvements come at moduli 8, 12, 16, 24, 48, 72, 96, 120, 144, 240, 336, 480, and 672.
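Such filters are cheap to compute for any modulus; here is a Python sketch. The exact counts depend on which trivial cases one also excludes, so they may not match the table above exactly:

```python
# For modulus m, find the residue pairs (x, y) for which
# x^3 + y^11 = z^2 is solvable mod m, and the fraction of the
# search space that survives the filter.
def admissible_pairs(m, a=3, b=11, c=2):
    squares = {pow(z, c, m) for z in range(m)}
    return {(x, y)
            for x in range(m)
            for y in range(m)
            if (pow(x, a, m) + pow(y, b, m)) % m in squares}

def surviving_fraction(m):
    return len(admissible_pairs(m)) / (m * m)
```

A pair ruled out mod m stays ruled out mod any multiple of m, which is why refining the modulus can only shrink the surviving fraction.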

Now, I'm sure if I actually knew any analytic number theory this would be the point to say "aha! L-functions" or something like that. But perhaps there's an algebraic number theory reason for this sequence which some thought could discover.

When it comes time to do the brute-force search, my favorite technique is Daniel Bernstein's heap-based approach, which gives us the candidate x^3 + y^11 values in order. Without using any of the above, I can tell you that there are no nontrivial solutions with |x| <= 1000000 and 0 < y < 300.
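The heap technique itself is short. A sketch for nonnegative x and y: keep one heap entry per y, each advanced through successive x, so the merged streams yield the sums in nondecreasing order. (The real search also handles negative x and tests each sum for being a perfect square.)

```python
# Bernstein-style heap enumeration of x^3 + y^11 in increasing order.
import heapq
import itertools

def sums_in_order(y_max):
    """Yield (x**3 + y**11, x, y) in nondecreasing order of the sum."""
    heap = [(y ** 11, 0, y) for y in range(y_max + 1)]
    heapq.heapify(heap)
    while True:  # infinite in x; callers should islice or break
        value, x, y = heapq.heappop(heap)
        yield value, x, y
        heapq.heappush(heap, ((x + 1) ** 3 + y ** 11, x + 1, y))

smallest = list(itertools.islice(sums_in_order(3), 8))
```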
Saturday, January 21st, 2017
6:05 pm
I don't fucking care what happens to Random Nazi Dude. He can piss off. We don't have to understand or dialogue with him. The media doesn't need to cover him. I don't care if people disrupt his interview. I fully support removing Nazis from your events, schools, organizations, and workplaces.

What I care about is that my Twitter feed is full of people who think political violence should be cheered and encouraged.

You know, just like Trump does. Remember that? From the campaign?

When we talk about "not normalizing Trump" it should mean "not normalizing his worldview." Not setting ourselves up as the mirror image of his fascist supporters.

There are times when it's correct and necessary to kill in defense of others. I'm willing to accept that even violent protest is necessary to correct injustice. Punching somebody whose views you find odious doesn't meet that standard.
Friday, January 20th, 2017
2:03 am
Wave Function Collapse for Automated Doodling
I've been playing around a little with https://github.com/mxgmn/WaveFunctionCollapse (actually with the C++ port here: https://github.com/emilk/wfc). The algorithm is pretty impressive at randomly generating large bitmaps from small examples. And the animated GIFs of it in action are an inspired presentation.

But... my first attempt to use it failed entirely, because I picked a very complicated bitmap which, even after simplification, caused the algorithm to engage in lengthy calculations that produced blobby output.

So I tried something simpler, with a pattern I doodle in many of my notebooks:

I was a little surprised to find that 3x3 tiles just couldn't seem to produce a successful example:

It works better with 4x4 tiles, though the success rate is still low:


2x2 works great but doesn't preserve the closed paths. What's going on here? Why is this pattern challenging for the WaveFunctionCollapse algorithm?

One thing to note is that WFC doesn't backtrack. It heuristically picks the most-constrained (lowest-entropy) tile to place next--- in other words, it's basically a greedy search. For many tile sets, greedy search is fine. But when there is a higher-order constraint like a closed loop, some choices might not work out at all. (Or, it might just be something dumb like I need to tell it that my tiles are symmetric. But I don't think so.)
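Here is a deliberately tiny 1-D sketch of that greedy observe/propagate loop (a hypothetical toy; the real algorithm works on 2-D overlapping patterns). Note there is no backtracking: a contradiction just fails, exactly the limitation discussed above.

```python
# Minimal 1-D WaveFunctionCollapse: each cell holds a set of candidate
# tiles; repeatedly collapse the cell with the fewest candidates
# (lowest entropy), then propagate the pairwise adjacency constraint.
import random

def wfc_1d(n, tiles, allowed, seed=0):
    """allowed: set of (left, right) tile pairs permitted to be adjacent."""
    rng = random.Random(seed)
    cells = [set(tiles) for _ in range(n)]

    def propagate():
        changed = True
        while changed:
            changed = False
            for i in range(n - 1):
                left_ok = {a for a in cells[i]
                           if any((a, b) in allowed for b in cells[i + 1])}
                right_ok = {b for b in cells[i + 1]
                            if any((a, b) in allowed for a in cells[i])}
                if left_ok != cells[i] or right_ok != cells[i + 1]:
                    cells[i], cells[i + 1] = left_ok, right_ok
                    changed = True

    while True:
        propagate()
        if any(not c for c in cells):
            return None  # contradiction: no backtracking, just give up
        undecided = [i for i, c in enumerate(cells) if len(c) > 1]
        if not undecided:
            return [next(iter(c)) for c in cells]
        i = min(undecided, key=lambda j: len(cells[j]))  # lowest entropy
        cells[i] = {rng.choice(sorted(cells[i]))}
```

With an "alternate A and B" constraint the greedy loop always succeeds; constraints with long-range structure, like closed loops, are where it starts failing.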

Is there any good way to characterize tile sets for which greedy search always works? Mostly works?
Tuesday, January 3rd, 2017
9:25 pm
Mathematics in "The Just City"
Because I'm reading a history of mathematics with lots of Greeks in it (including Aristotle questioning whether parallel lines really exist) I was inspired to reread "The Just City" looking for math. Now, Jo is not a mathematician, but there are a few tantalizing hints.

The city residents certainly do math, although astronomy is the only subject specifically mentioned. Interestingly, they have a Keplerian orrery showing elliptical orbits, presumably in a heliocentric model. This seems the sort of thing that would fit in well, and doesn't require any fundamental innovations beyond the conic sections Hellenistic mathematics was well acquainted with. (You can use hyperbolas to trisect angles, for example.) But the practice of astronomy is tricky.

Would the Just City use Greek numerals? They are somewhat cumbersome for high-precision arithmetic, though of course the Greeks did pretty well (and had abaci.) But compared to place-value notation they are far from ideal, and so shouldn't a city that strives toward excellence adopt the ahistoric Indo-Arabic numerals?

Similarly, astronomy requires a lot of precomputed tables of trigonometric functions to do well. The ancients had such tables--- but we know that many of them were buggy. Excellence ought to require using a modern table, but would the masters be willing to accept one? (Particularly since it would require allowing modern numerals.)

But if you're going to allow a modern table, are you going to forbid the modern analysis which constructed it? Ancient techniques were very clever but again, cumbersome in practice. Even leaving aside the workers and the sciences involved in electricity, many of the masters would be conversant with Euclid's parallel postulate and the philosophical debate surrounding it. Would they really prefer that the library of the city not have the answer?

Jo is clear that philosophy in the City has a cut-off date. No moderns (and certainly no Christians.) History ends with the Battle of Lepanto in 1571. It's less clear to me where you draw that line in a principled way for mathematics. Of course, the Just City does not shy from unprincipled ways of making decisions, and practical concerns mean that much of the mathematics the masters might allow is in practice inaccessible, just because nobody among the masters understands and can teach it. But Kepler, who was born in 1571, used mathematics in a way so different from the Hellenistic tradition that it would be difficult to include his work without including his foundation.

In fact, Simmea learns the calculus, which dates to the late 17th century. From a philosophical standpoint, this is particularly troubling as the logical problems of infinitesimals were well-understood at the time (and certainly could be apparent to anyone trained in the Socratic tradition.) They were only resolved in the 19th century with modern real analysis. I don't think this material could be included without bringing in aspects of modern philosophy as well, even though most practicing mathematicians default to a rough-and-ready Platonism.
Sunday, January 1st, 2017
12:54 pm
Learning Swift via Advent Of Code
This year I participated in the Advent of Code contest in a new programming language, Swift. New both to me and objectively! The language is still in active development. I posted my solutions on GitHub: https://github.com/mgritter/aoc2016 (Technically only 24 programs, because one day I used Wolfram Alpha instead of writing any Swift code.) Because I got started late and used a new language, I wasn't at all competitive for the leaderboard.

The online language guide and other online resources made it pretty easy to get up to speed and start programming, and the increasing complexity of AoC made a good fit, although I feel like I peaked in using new language features about halfway through. One of the downsides of a new language that is still willing to make changes is that many of my searches returned content about Swift 2, not Swift 3. Using Linux was also somewhat of a challenge because a lot of the "getting started" guides are more Xcode-specific. The interactive mode (REPL) helped try language features out, but I never did figure out how to import my modules in order to test my functions interactively.

What I liked about Swift:

Terse but clear syntax. The language removes a lot of cruft from C-style languages. "If" statements don't need parentheses, "switch" statements don't need breaks, closures are easy to write. Type inference means you're getting the benefit of type checking without spending a lot of time writing type names. The range operators ... and ..< beat any C++ or Python equivalent.

C#-style attributes work pretty well too, although I found their usage within the standard library inconsistent: stylistically what was an attribute and what should be a function was not always clear to me.

Swift makes it very easy to distinguish constants ("let") from variables ("var") and I found this pleasing to write, but I'm not sure how helpful it ultimately was. It allows a very powerful 'switch' for enumerating cases, not just value-matching, which I used pretty heavily but probably could have used more.

Type System. There are a lot of things right here. Enumerations are first-class types and can carry extra values. Interface definition via protocols works pretty well. Structs and Classes make an entity/value distinction you can't get in Python or Java.

The most obvious innovation here is that types can be decorated as "optional": Int becomes Int?, MyClass becomes MyClass?. Optional types permit a 'nil' value and must be explicitly unwrapped (using either an operator or a special conditional form.) The "flatMap" function can be used to simultaneously transform and filter a collection by giving it a function of the form X->Y?, where the 'nil's get discarded, for example:

let program = input.lines.flatMap { Instruction.parse( $0 ) }

(The second part is a closure, and parentheses on the flatMap function call are optional in this common case.)


What I disliked about Swift:

Buggy implementation, particularly on the Linux port. I filed a couple of bugs. One basically prevented me from exploring the regular expression library, since it wasn't usable in the REPL. I also encountered a known issue in type inference that fortunately had a workaround. The CharacterSet class (ported from NSCharacterSet) has a ton of filed bugs and a comment to the effect of "there sure are a lot of these, what are we going to do about it?"

Getting started was hard because the helpful "read a file into a String" function had not yet been ported, so I had to find an (itself buggy) Gist to help me understand how to use Glibc and convert the raw bytes into a String.

Poor tuple support. Tuples work OK at basic stuff, and the ability to do named tuples is promising. Switching on a tuple works pretty well. But compared to Python, forget it. There's no easy way to do list/tuple conversion. Tuples aren't hashable by default, and for AoC I kept wanting to put a tuple as a dictionary key. And the absence of a 3-way zip led to this monstrosity:

return zip( zip( left, center), right ).map { x,y in (x.0,x.1,y) }

There were some libraries I could have downloaded that offered zip3, zip4, zip5, etc. as separate functions.

Whitespace matters for operators. I tried to stay away from this, but I got bitten a couple of times by mistake. You can define new operators, which is cool. But "a + b" and "a +b" are not the same; the latter applies a prefix "+". The rules are:

The whitespace around an operator is used to determine whether an operator is used as a prefix operator, a postfix operator, or a binary operator. This behavior is summarized in the following rules:

If an operator has whitespace around both sides or around neither side, it is treated as a binary operator. As an example, the +++ operator in a+++b and a +++ b is treated as a binary operator.

If an operator has whitespace on the left side only, it is treated as a prefix unary operator. As an example, the +++ operator in a +++b is treated as a prefix unary operator.

If an operator has whitespace on the right side only, it is treated as a postfix unary operator. As an example, the +++ operator in a+++ b is treated as a postfix unary operator.

If an operator has no whitespace on the left but is followed immediately by a dot (.), it is treated as a postfix unary operator. As an example, the +++ operator in a+++.b is treated as a postfix unary operator (a+++ .b rather than a +++ .b).

Love/Hate features

Unicode support. They have a very clear explanation of why Unicode-compliant string parsing is hard, and their String and Character classes are clearly written to encourage you to do things correctly. Plus you can use Unicode for identifier names (and not just the boring ones, you can use emoji too.) Unfortunately I couldn't figure out if Emacs really supported this.

The downside is that String manipulation is hard for Advent-of-Code style problems where the input is usually ASCII but you're doing a lot of parsing. Again, maybe if the regexp library were usable for me some of this could have been simpler. But by far the grossest thing I wrote was this code to rotate the letters a-z:

        switch ch {
        case "a"..."z":
            let scalars = String( ch ).unicodeScalars
            let val = scalars[scalars.startIndex].value - scalar_a
            let valInc = scalar_a + ( val + by ) % 26
            return Character( UnicodeScalar( valInc )! )

Often I ended up working with arrays of characters ("[Character]") rather than String, which was a reasonable compromise, I think. A friend tried to argue on Twitter that this made you think about the cost of operations that would be expensive on a Unicode string.

Extensions. You can add methods and attributes to any existing class. Even the standard library. "import Foundation" adds a lot of functionality to String and Array. I wrote this which I used over and over:

extension String {
    public var lines : [String] {
        get {
            return self.components( separatedBy: "\n" ).map { $0.trimmingCharacters( in: .whitespaces ) }
        }
    }
}

".components" is already a String extension added by Foundation (which was confusing at first because I didn't realize I had to "import Foundation" to get all the documented String functions.)

This can also be used to break up your implementation into smaller pieces where a single class definition would be hairy, for example if you want to define some utility classes in a place that "makes sense". If you define a new protocol or want to make an existing type hashable, the mechanism works really well.

From a software engineering standpoint, I find this potentially disastrous. I don't entirely understand all the namespace semantics (it's there, and usually unobtrusive) but there doesn't seem to be any way to resolve conflicts in extensions, like there is for class names. I used a hashing library that added a ".md5()" function to String. That's great if there's only one such library in use.

The idiom here is "don't pollute the global namespace" which is good but it seems like you end up polluting every other namespace instead.

Package manager

Mostly this was OK, it "just works". I was easily able to reference external packages. For internal modules, though, I felt like it could do more, and it was hard to figure out the basics of "how do I use a library module in my main program." I ended up with this:

    Target( name: "AocMain", dependencies: ["SimpleFile"] ),
    Target( name: "day1", dependencies: [ "AocMain" ] ),
    Target( name: "day2", dependencies: [ "AocMain" ] ),
    Target( name: "day4", dependencies: [ "AocMain" ] ),

The first target is a library, that depends on another library. The remaining targets are executables. The difference? The day1, day2, day4, ... subdirectories contain a file named "main.swift". Easy to set up yet... I didn't feel very good about it.

It feels like there should be some automatic dependency analysis going on. If I "import Foo" and "Foo" is another library in the same package, shouldn't it just work? And is it linking the external packages with every target? I didn't investigate, but it doesn't feel right to me.

Named Parameters: Swift is showing its Objective-C roots here. If you have a function

func doSomething( to: Nail, with: Hammer ) -> Woodworking

then you call it like this:

let result = doSomething( to:myNail, with:borrowedHammer )

You can make the named parameters optional, but it's pretty clear this is the desired idiom and sometimes it's used to good effect. But often I end up writing what feels redundant, like:

parse( input:input )
Coordinate( x:x, y:y )

so I don't know whether this is my problem and I should make better choices, or a mechanism that could be improved.
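Swift's escape hatch for the redundant cases is declaring a parameter with an underscore, which suppresses the label at the call site. A sketch, with hypothetical stand-ins for the `parse` and `Coordinate` examples above:

```swift
// Declaring a parameter name as "_" removes its label at the call site.
func parse(_ input: String) -> [String] {
    return input.split(separator: ",").map(String.init)
}

struct Coordinate {
    var x: Int
    var y: Int
}

let input = "1,2,3"
let fields = parse(input)        // not parse(input: input)
let c = Coordinate(x: 3, y: 4)   // memberwise init keeps its labels
```

Whether to drop the label is a per-declaration choice, so it only helps at call sites for functions you control.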

The compiler tries to be helpful if you misspell a named parameter or a variable name, or call an overload that doesn't exist. But often it's only helpy, particularly if you have imported Glibc with its ton of symbols.

I enjoyed learning the language, but since I'm not doing any iOS programming I doubt I'll stick with it. Probably I will learn Go or Rust next and keep Python as my prototyping language.
Thursday, December 1st, 2016
11:35 pm
Biting the Hand That Feeds Me
The National Venture Capital Association has some advice for Donald Trump: don't tax us. The letter is long on generalities except where tax issues are concerned. While most of the policy issues are ones to which I'm at least a little sympathetic, I find their presentation self-serving and overly optimistic, rather than really engaging with any complexity.

Since the Reagan Administration, our tax code has been relatively effective at encouraging patient, long-term investment, but on net has been hostile to entrepreneurial companies. For example, punitive loss limitation rules punish startups for hiring or investing in innovation, while benefits such as the R&D credit are inaccessible to startups. Unfortunately, tax reform conversations in Washington have ignored these challenges while at the same time proposing to raise taxes on long-term startup investment to pay for unrelated priorities.

For instance, carried interest has been an important feature of the tax code that has properly aligned the interests of entrepreneurs and venture investors since the creation of the modern venture capital industry. Increasing the tax rate on carried interest capital gains will have an outsized impact on entrepreneurship due to the venture industry’s longer holding periods, higher risk, smaller size, and less reliance on fees for compensation. These factors will magnify the negative impact of the tax increase for venture capital fund formation outside of the traditional venture regions on the coasts.

No startup founder cares about loss limitation rules. The tax code has never been a factor in whether we hire or invest in R&D. Why? Because startups lose money. It only matters later, when the startup starts making money, that the company doesn't get as big a deduction for its previous losses. (Those banked losses are an asset... but any startup for which that is material is failing or doing it wrong.)

That's just a leadup to the NVCA's real concern: Trump's stance against the "carried interest loophole". VCs get paid 2-and-20 like other fund managers: 2% of fund assets annually (sometimes negotiated downwards) and 20% of fund income. The fund manager's 20% is counted as long-term capital gains because it is treated as their "share" of the multi-year investment rather than just a management fee.

There are some reasonable arguments that this arrangement is the proper way to think of VC compensation, but VCs often fall back on the "we won't do it if we're taxed higher than 15%" line which is in fact what NVCA says in their letter. This is nonsense. What else are they going to do? Work for salary and pay the same tax rate they're complaining about, only for less money? Engage in less risky investments and let somebody else get the big payoffs instead? Become hedge fund managers and live off the 2% management fee? (oh, wait, that's taxed as normal income too.)

Complaints about the IPO process being burdensome are fair, but it's not clear that is what's driving companies to stay private longer. NVCA is not at all clear about what they'd like done about it.

Debate during the election often focused on illegal immigration, but unfortunately not how legal immigration can create jobs for American citizens, including in underemployed areas. This can be accomplished by attracting the world’s best entrepreneurs to our country through creation of a Startup Visa and allowing more high-skill immigrants to help build startups.

Disingenuous. Startup-founding immigrants have little incentive to go to (and stay in) the Rust Belt.

Basic research: I'm in favor of that too, but the request here is also self-serving. The more applied research gets performed in universities, the less risk for venture-backed firms to develop it from zero. And in fact that's exactly what they want:

In addition to funding basic research, encouraging the commercialization of more technology will lead to increased job creation and economic growth. Many states with economically distressed areas can better utilize research universities to spread startup activity. High-growth startups frequently come out of university-funded research that is commercialized and become private companies.

This is a subsidy, and should be judged against other forms of public investment. (If that's the only way we can get more funding for public universities, I'm for it!)

Unfortunately, while our competitors have upped their game, American policymakers have taken our leadership in entrepreneurship for granted, as we detail below. Consequently, whereas twenty years ago U.S. entrepreneurs received approximately 90 percent of global venture capital investment, that share has fallen over time to only 54 percent in 2015. Further, in three of the last four years, at least half of the top ten largest venture investments in the world have occurred outside the U.S.

How much leadership in venture investments is "enough"? 75%? 90%? 95%? This relative comparison is meaningless. Venture funding in the U.S. has not dropped. The rest of the world is just growing faster. That's a good thing, and possibly inevitable. When China has a lot of cash, it's going to make a lot of investments.

Partner with states to spread startups in areas of economic distress. Your administration has an incredible opportunity to bolster public-private partnership economic development efforts to spur entrepreneurship. Ohio is one state that is leading the way to transform their economy and create opportunity. Third Frontier in Columbus is providing access to business expertise, mentorship, capital, and talent to turn great ideas into growing companies, and JumpStart in Cleveland provides equity-based capital to help startups grow, matches talented individuals with growing companies, and provides expertise to startups.

This is both hopelessly vague and disingenuous. These seed efforts mean little unless coastal VCs are willing to cut checks for mid-stage and late-stage capital. That's happening somewhat. But the language used here is often code for "investment tax credits", which I feel are of questionable value. The bigger issue is that the Rust Belt often lacks the worker base to benefit from the technology startups we're talking about here. Or to put it another way, the jobs that are created are ones where the workers are already in high demand, not substitutes for the displaced manufacturing jobs. If Columbus creates 1,000 new programming jobs, they will probably do more to stop brain drain than to help rural Ohio.
Monday, November 28th, 2016
11:53 pm
More Sol LeWitt
A good article on Sol LeWitt which explains more about how the wall drawings are implemented in practice: http://www.greatwhatsit.com/archives/1379 You "buy" a wall drawing from its current owner; they paint it over and you get a draftsman to re-create it on your wall. It comes with a certificate of authenticity. (There is sometimes a diagram as well. A better picture of one can be found here: http://www.museoreinasofia.es/sites/default/files/salas/informacion/104.03_sol_lewitt_eng.pdf)

The important part of this process, of course, is that it preserves uniqueness. But it doesn't close the door on rogue implementations of the initial recipe. Here's Wall Drawing #47, although it is not on a wall and comes with no certificate, so of course it is not Wall Drawing #47 at all:

Here's my WaterColorBot working on a pen-and-paper version:

Thursday, November 24th, 2016
2:28 am
Two days with the WaterColorBot
I have a new toy; it's Super-Awesome Sylvia's WaterColorBot from Evil Mad Scientist Labs. This is basically a pen plotter that knows how to use a watercolor paint palette. It can be used with crayons or colored pencils or dip pens instead of watercolors, too.

My first attempt showed some limitations of the RoboDraw software, which takes an SVG and attempts to paint it:

I made heavy use of clip paths in trying to recreate this Sol LeWitt wall drawing, and RoboDraw 0.98 doesn't appear to obey them. The beta 2.0 software couldn't paint my SVG at all, but as far as it got, it looked like it had the same problem. Here's a rendering from the same program I used to create that SVG:

However, the Inkscape plugin for WaterColorBot gives far better control (at the cost of some more complexity). You draw each color as a separate layer in Inkscape, and the paths you define there are exactly what the brush follows, leaving only the complexity of judging fills, the appropriate amount of water, the distance between paint refills, the height of the brush, and the quality of the paper. Generating SVGs from Python code and then painting them in Inkscape works pretty well. Here are two crayon drawings I did via this method, both inspired by Sol LeWitt:

I was able to use Inkscape to convert a small piece of pixel art to a PNG, and then paint the PNG in RoboDraw. Prewatering the paint and using printer paper led to a rather soggy output, but the conversion process worked pretty well and the fills were more sensible here.

Converting a Tintri logo to an outline in Inkscape and plotting that with a colored pencil worked quite well. (But I didn't take a picture of that one.) The hardware seems rather inexact with brush strokes but quite accurate on pencil drawings--- not sure what's going on there. It's capable of drawing pretty well-formed circles, though with a slight mismatch where the curve closes.

The control software makes choices that are a little surprising sometimes. It doesn't paint like a human, of course, who would refill at the end of a stroke. Instead it will quit halfway through when the distance threshold has been met, or start a stroke and then immediately go for more paint. The fills are also done in scan lines when a human would tend to break them up. There is a low-level API so if sufficiently motivated I could try my own hand at doing better.

A limitation of the simple hardware is that there is no feedback on the amount of pressure on the brush or pencil. The brush mechanism has software-controlled height, and they even include software to draw spiral paintings where the height of the brush is used to print an image. But you have to calibrate the height by hand, and it's easy to lift the carriage up by putting a pen too low. On the other hand, simply resting on the paper doesn't provide good results with a pencil or crayon. The guide rails are fairly simple and open at the top, so the amount of pressure is limited, but it still would be nice to have a pressure sensor to tell if the crayon needs to be lowered a bit to keep drawing. (You can see in the first crayon drawing above that I paused and adjusted the height a couple times.)
Saturday, November 19th, 2016
9:32 pm
Sol LeWitt and Conventions
Some of my favorite pieces from a recent SFMOMA visit were Sol LeWitt's conceptual art. Unfortunately the wall drawings did not photograph well, so here's something that LeWitt actually executed himself, "Forms Derived from a Cube."

LeWitt's wall drawings are meant to be ephemeral, and the ones I saw on display were going to be painted over later. They are given as instructions for a draftsperson to execute. You can find examples at http://massmoca.org/sol-lewitt/ or http://www.ericdoeringer.com/ConArtRec/LeWitt/LeWitt.html

The instructions are somewhat terse. For example, Wall Drawing #104 is simply "Arcs from the midpoints of two sides of the wall." But it looks like most executions do not actually engage with the freedom this provides. They instead more or less follow the initial interpretations and the conventions implied by other drawing instructions. For example, the standard interpretation is that the drawing is black unless specified otherwise. The draftsman ensures that lines or arcs are evenly spaced unless they are "not straight" or explicitly "random", and that they intersect the boundaries of the figure. This is an "originalist" reading that says we follow LeWitt's earliest examples if the text is ambiguous.

Unless, of course, you let the museum visitors try their hand. Then you get variety: http://risdmuseum.org/manual/45_variations_of_a_drawing_sol_lewitt_and_his_written_instructions Of course, some of this variety is "I want to draw Snoopy rather than following these stupid instructions." (Even the examples there assume that the basic topology of parallelogram-inside-circle is specified. But is it?) Which is more in the spirit of "conceptual art"? The RISD museum decided that the text was insufficient:

These differences made clear that despite LeWitt’s desire for objectivity, a great deal of subjectivity exists when others are involved in the making. LeWitt certainly was aware of the variations that might occur, and although he might have accepted that his instructions could be drawn a number of ways, he did have preferences as to how his works were installed. That’s why, rather than using the written instructions as the sole guide for installing the wall drawing at RISD, the Museum’s installation crew was assisted by draftspeople from the Sol LeWitt Foundation who provided a diagram to help LeWitt’s idea take form.
Tuesday, October 25th, 2016
8:42 pm
Principal Component Analysis
A similar technique to NNMF from my last entry is Principal Component Analysis. It still factors the input matrix, but now negative values are allowed. A big challenge for using PCA is what to do with those negative numbers, which have no standard interpretation. (This is one of the advantages of NNMF, that it can be applied in scenarios in which only positive values make sense.)

PCA is better at identifying features in my sample of pixel art, and degrades more gracefully as the number of components is lowered. The graphs here show original, reconstructed, and the PCA components separated by palette color. (Blue is negative, red is positive--- sorry.)

Reduced dimension (10 components). Using nearest-neighbor on the blended colors works pretty well at reconstructing the original images:

However, the blends between flowers are, if anything, worse. Nearest-neighbor doesn't work very well at all. The best results I got were from capping the negative contributions at 0 and blending:
