2016 week 48 in programming

Crypto 101 - an introductory course on cryptography, freely available for programmers of all ages and skill levels

The original Crypto 101 talk tries to go through all of the major dramatis personae of cryptography needed to make TLS work, in 45 minutes. This book is the natural extension of that, with an extensive focus on breaking cryptography.

How the Singapore Circle Line rogue train was caught with data

What we’d established was that there seemed to be a pattern over time and location: Incidents were happening one after another, in the opposite direction of the previous incident. Could it be something that was not in our dataset that caused the incidents? Imaginary lines connecting the incidents looked suspiciously similar to those in a Marey Chart. We then grouped all related pairs of incidents into larger sets using a disjoint-set data structure. Next, we calculated the percentage of the incidents that could be explained by our clustering algorithm. The result: (189, 259, 0.7297297297297297). What it means: Of the 259 emergency braking incidents in our dataset, 189 cases - or 73% of them - could be explained by the “Rogue train” hypothesis. We coloured the incident chart based on the clustering results.
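The grouping and percentage steps described above can be sketched roughly as follows. This is a minimal union-find illustration, not the article's actual code: the `DisjointSet` class, the `explained_fraction` helper, the sample pairs and the rule that an incident counts as "explained" once its cluster has at least `min_size` members are all assumptions made for the example.

```python
class DisjointSet:
    """Union-find with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        # Walk up to the root, shortening the path as we go.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # Attach the smaller tree under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]


def explained_fraction(n_incidents, related_pairs, min_size=2):
    """Merge related pairs into clusters; return (explained, total, fraction)."""
    ds = DisjointSet(n_incidents)
    for a, b in related_pairs:
        ds.union(a, b)
    explained = sum(
        1 for i in range(n_incidents) if ds.size[ds.find(i)] >= min_size
    )
    return explained, n_incidents, explained / n_incidents


# Hypothetical data: six incidents, three related pairs.
print(explained_fraction(6, [(0, 1), (1, 2), (4, 5)]))
```

With these made-up pairs, incidents 0-2 and 4-5 form two clusters and incident 3 stays alone, so five of six incidents are counted as explained. The returned tuple has the same shape as the (189, 259, 0.7297...) result quoted in the summary.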

do {…} while (0) in macros

If you are a C programmer, you must be familiar with macros. If you don’t define macros carefully, they may bite you and drive you crazy. Many macros in the Linux kernel and other popular C libraries use do/while(0). It is the only construct in C that lets you define macros that always work the same way, so that a semicolon after your macro always has the same effect, regardless of how the macro is used. Without it, it isn’t possible to write multistatement macros that do the right thing in all situations; you can’t make macros behave like functions without do/while(0). In conclusion, macros in Linux and other codebases wrap their logic in do/while(0) because it ensures the macro always behaves the same, regardless of how semicolons and curly brackets are used in the invoking code.

Let’s Stop Bashing C

The point the author makes about integer division is that it can confuse beginners. The reason for it is simple: when one is working with integers, one often has a problem related to integers in the first place. Performance is not even the point; floating-point numbers are much more complex than integers, and above all, they have different semantics. Bashing the increment and decrement operators has become quite popular since Swift chose to omit them. In particular, in C, it’s not equivalent to the postfix increment operator. With regard to semicolons: you can’t just interpret every newline as if it were a semicolon, because newlines become context-sensitive in this way. Treating newlines as statement endings is hard for humans to remember correctly, too, however appealing it might be at first glance.

Google’s new public NTP servers provide smeared time

Rust: 128 bit integers preparing to be released

Writing C without the standard library - Linux Edition

How do we find out which syscall puts uses? We can either look through the syscall list, or simply install strace to trace syscalls and write a simple program that uses puts. If you don’t know how to do something with syscalls, do it with libc and then strace it to see which syscalls it uses on the target architecture. Oh no! The “write” function is part of the standard library! How do we invoke syscalls without having to link the standard lib? Takes the syscall number followed by either pointers or integers as parameters; sets rax to the syscall number; sets rdi, rsi, rdx, r10, r8 and r9 to the parameters. Syscall numbers are usually named SYS_ followed by the syscall name. You can also add the -m32 flag to check values for 32-bit. Legacy syscalls on i386: There are a few things you should be extremely careful with when dealing with syscalls, especially when targeting multiple architectures. Here’s our fixed stat struct: typedef uintptr dev_t; typedef intptr syscall_slong_t; typedef uintptr syscall_ulong_t; typedef uintptr time_t; typedef struct.

The Practice of Code Review

This process can go by different names: code review, peer review, programmer quality assurance, and probably a few others. Code review is a skill that takes a lot of time and practice to become proficient in, and it takes even longer to master. At first you’re more likely to be the one having your code reviewed. Working with production code is very different from what you learn in a classroom. Code reviewers will quickly tell you if your code is hard or impossible to follow. How much thought do you put into naming conventions? Do you use nondescript variable names like a, b, c? That won’t fly with code review. If you don’t have someone to review your code, find someone.

Let’s Stop Copying C

C is fairly old - 44 years, now! - and comes from a time when there were possibly more architectures than programming languages. Alas, the popularity of C has led to a number of programming languages’ taking significant cues from its design, and parts of its design are slightly questionable. I’ve listed some languages that do or don’t take the same approach as C. Plenty of the listed languages have no relation to C, and some even predate it - this is meant as a cross-reference of the landscape, not a genealogy. In languages with C-like header files, most headers include other headers include more headers, so who knows how any particular declaration is actually ending up in your code? Oh, and there’s the whole include guards thing. I think it’s considerably worse in a statically typed language like C, because the whole point is that you can rely on the types. The language explicitly leaves type concerns in my hands. So if you’re designing a language, don’t just copy C. Don’t just copy C++ or Java.

I made this: a multiplayer game where you code to play

Arduboy, the game system the size of a credit card. Create your own games, learn to program or download from a library of open source games for free!

Arduboy is a miniature open-source game system that you can program yourself! If you don’t know how to program yet, it’s a great way to learn! If you already are an expert developer, join our community and show off your skills! Arduboy is a game system the size of your imagination! Recently funded on Kickstarter and on display at Maker Faires around the world, everyone loves Arduboy!

Visualizing How Developers Rate Their Own Programming Skills

On average, developers rate themselves 7.09 / 10. Employing traditional regression analysis to build a model for predicting programming ability would be tricky: does having more experience cause programming skill to improve, or does having strong innate technical skill cause developers to remain in the industry and grow? We can easily confirm that a positive correlation exists between programming activity and experience, with newbie developers rating their skills 5.02 / 10 on average, and extremely experienced developers rating their skills three whole ranks higher at 8.13 / 10. What’s also notable is the range of values selected: for developers with less than 1 year of experience, the distribution is almost completely flat between 1 and 7, showing that they are more honest in the self-assessment of their programming skills. Do freelance / contract developers believe they are better programmers than full-time developers? What about repository commit activity by developers? Are developers who commit more better? One could argue that a developer who commits code often is either vigilant with accounting for functional code changes, or polluting the codebase in an attempt to show productivity. Are developers who use Stack Overflow as a resource better developers who know how to properly use external references in times of crisis, or are they developers who use it as a crutch to compensate for weak coding skills?

A curated awesome list of lists of interview questions.

Undocumented Instructions in Production - Why some NES games use undocumented 1-byte and 2-byte NOPs

The instruction $89 on the 6502 is a two-byte NOP. Based on adjacent instructions in the opcode matrix, especially LDA #ii, it would have been STA #ii, a store to an immediate value, which makes no sense. On the 65C02, this instruction is changed to BIT #ii, which almost behaves as a two-byte NOP. One hypothesis is that a programmer working on both NES projects and projects for some 65C02-based system forgot that the original 6502 lacked BIT #ii, but because the instruction does so little anyway, the programmer didn’t notice any difference. A clockslide is a sequence of instructions that wastes a small constant amount of cycles plus one cycle per executed byte, no matter whether it’s entered on an odd or even address. With official instructions, one can construct a clockslide from CMP instructions: … C9 C9 C9 C9 C5 EA. Disassemble from the start and you get CMP #$C9, CMP #$C9, CMP $00EA. Disassemble one byte in and you get CMP #$C9, CMP #$C5, NOP. A calculated start address into a clockslide can be used with indirect jumps (or LDA highbyte PHA LDA lowbyte PHA RTS) to precisely control timing, such as when playing PCM audio or sending video register changes to the PPU in a raster effect. CMP has a side effect of destroying most of the processor status flags, but unofficial instructions that skip one byte can be used to preserve them. As LOIS 16192 mentioned, the official NOP instruction can be inserted at random places in a particular subroutine that isn’t an inner loop. It adds even more entropy to use unofficial NOPs, two-byte NOPs, or two-byte NOPs that read the zero page.
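The two disassemblies of the CMP clockslide above can be checked with a small decoder. The sketch below is illustrative only: it models just the three official opcodes that appear in the quoted byte sequence, using their documented sizes and cycle counts, and the names `OPCODES`, `SLIDE` and `run_slide` are made up for this example.

```python
# (mnemonic, size in bytes, cycles) for the three opcodes in the quoted slide.
OPCODES = {
    0xC9: ("CMP #imm", 2, 2),  # compare with an immediate operand
    0xC5: ("CMP zp", 2, 3),    # compare with a zero-page address
    0xEA: ("NOP", 1, 2),
}

# The tail of the clockslide quoted in the summary.
SLIDE = bytes([0xC9, 0xC9, 0xC9, 0xC9, 0xC5, 0xEA])


def run_slide(code, entry):
    """Decode from offset `entry` to the end; return (mnemonics, total cycles)."""
    pc, cycles, trace = entry, 0, []
    while pc < len(code):
        name, size, cost = OPCODES[code[pc]]
        trace.append(name)
        cycles += cost
        pc += size  # skip the operand byte, if any
    return trace, cycles


for entry in range(len(SLIDE)):
    trace, cycles = run_slide(SLIDE, entry)
    print(f"enter at +{entry}: {len(SLIDE) - entry} bytes, {cycles} cycles, {trace}")
```

Entering at offset 0 executes six bytes in seven cycles and entering at offset 1 executes five bytes in six cycles, so each extra byte of slide costs exactly one extra cycle whether the entry address is odd or even.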

OpenAI Universe

Universe exposes a wide range of environments through a common interface: the agent operates a remote desktop by observing pixels of a screen and producing keyboard and mouse commands. The environment exposes a VNC server and the universe library turns the agent into a VNC client. Universe includes browser-based environments which require AI agents to read, navigate, and use the web just like people - using pixels, keyboard, and mouse. Universe agents must deal with real-world griminess that traditional RL agents are shielded from: agents must run in real-time and account for fluctuating action and observation lag. Pong is one of the easiest Atari games, but it had the potential to be intractable as a Universe task, since the agent has to learn to perform very precise maneuvers at 4x realtime. While solving Universe will require an agent far outside the reach of current techniques, these videos show that many interesting Universe environments can be fruitfully approached with today’s algorithms. Each of these agents uses the same code and hyperparameters as the Flash game agents.

Introducing Amazon Lightsail

You get the simplicity of a VPS, backed by the power, reliability, and security of AWS. As your needs grow, you will have the ability to smoothly step outside of the initial boundaries and connect to additional AWS database, messaging, and content distribution services. A Quick Tour: Let’s take a quick tour of Amazon Lightsail! Each page of the Lightsail console includes a Quick Assist tab. Advanced Lightsail - APIs and VPC Peering: Before I wrap up, let’s talk about a few of the more advanced features of Amazon Lightsail - APIs and VPC Peering. CreateInstances - Create one or more Lightsail instances. All of the Lightsail instances within an account run within a “shadow” VPC that is not visible in the AWS Management Console. If the code that you are running on your Lightsail instances needs access to other AWS resources, you can set up VPC peering between the shadow VPC and another one in your account, and create the resources therein. You can now connect your Lightsail apps to other AWS resources that are running within a VPC. Pricing and Availability: We are launching Amazon Lightsail today in the US East Region, and plan to expand it to other regions in the near future.

The cyber Swiss Army knife

There are well over 100 operations in CyberChef allowing you to carry out simple and complex tasks easily. CyberChef encourages both technical and non-technical people to explore data formats, encryption and compression. Digital data comes in all shapes, sizes and formats in the modern world - CyberChef helps to make sense of this data all on one easy-to-use platform. For those comfortable writing code, CyberChef is a quick and efficient way to prototype solutions to a problem which can then be scripted once proven to work. It is expected that CyberChef will be useful for cybersecurity and antivirus companies. It is hoped that by releasing CyberChef through GitHub, contributions can be added which can be rolled out into future versions of the tool. There are around 150 useful operations in CyberChef for anyone working on anything vaguely Internet-related, whether you just want to convert a timestamp to a different format, decompress gzipped data, create a SHA3 hash, or parse an X.509 certificate to find out who issued it.
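As a rough idea of what "scripted once proven to work" could look like, here is a sketch of three of the operations named above using only the Python standard library. It is not CyberChef code and says nothing about how CyberChef implements these operations; the function names are invented for the example.

```python
import gzip
import hashlib
from datetime import datetime, timezone


def timestamp_to_iso(unix_seconds):
    """Convert a UNIX timestamp to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).isoformat()


def gunzip(data):
    """Decompress gzipped bytes."""
    return gzip.decompress(data)


def sha3_256_hex(data):
    """Return the SHA3-256 digest of the given bytes as a hex string."""
    return hashlib.sha3_256(data).hexdigest()


# Round-trip some sample data through the three helpers.
packed = gzip.compress(b"hello, chef")
print(timestamp_to_iso(0))
print(gunzip(packed))
print(sha3_256_hex(gunzip(packed)))
```

A recipe worked out interactively would translate into a short chain of such helper calls.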

SQL injection vulnerabilities in Stack Overflow PHP questions

Parsing C++ is literally undecidable

Many programmers are aware that C++ templates are Turing-complete, and this was proved in the 2003 paper C++ Templates are Turing Complete. The C++ FQA has a section showing that parsing C++ is undecidable, but many people have misinterpreted the full implications of this. Some people misinterpret this statement to simply mean that fully compiling a C++ program is undecidable, or that showing the program valid is undecidable. This line of thinking presumes that constructing a parse tree is decidable, and that only further stages of the compiler such as template instantiation are undecidable. Simply producing a parse tree for a C++ program is undecidable, because producing a parse tree can require arbitrary template instantiation. The article’s example, abridged here, has the shape: struct SomeType; template <…> struct TuringMachine; template struct S; template<> struct S; int x; int main(). The parse tree itself depends on arbitrary template instantiation, and therefore the parsing step is undecidable.

Dolphin Progress Report: November 2016

On the 22nd of November, Marcan of Hackmii.com released the Homebrew Channel as an open source application, removed the anti-emulation hooks, and fixed a few bugs in Dolphin so that it could run properly! Nearly a decade after its inception, the Homebrew Channel is finally emulated! Admittedly, it has very little use for Dolphin, but it is extremely cool to have. Some packed Wii homebrew ELFs will currently only run in Dolphin from the Homebrew Channel. One more thing of importance: If you are running Linux and have a Sandy Bridge, Ivy Bridge, Haswell, or Broadwell CPU, you may be having some weird performance issues with Dolphin. In order to make the highest quality video dumps from within Dolphin, users will want to use the built-in framedump features. There are people who use Dolphin’s Android builds, and some very high end devices can handle Dolphin in some situations. While Dolphin will currently invalidate UIDs on new builds, this is not necessary and with the proper infrastructure they will only be invalidated on actual shader generation changes. While it’s been rather maligned by users and developers alike, it remains the most complete UI for Dolphin.

What’s new in Git 2.11?

git diff --submodule=short displays the old commit and new commit from the submodule referenced by your project. git diff --submodule=log is a bit more useful, and displays the summary line from the commit message of any new or removed commits in the updated submodule. Git 2.11 introduces a third option: --submodule=diff. Stash commands now accept a bare index: git stash show 1, git stash apply 1, git stash pop 1, and so forth. Git LFS reduces your repository size by using a clean filter to squirrel away large file content in the LFS cache, and adds a tiny “pointer” file to the Git object store instead. Smudge filters are the opposite of clean filters - hence the name. The Git LFS smudge filter transforms pointer files by replacing them with the corresponding large file, either from your LFS cache or by reading through to your Git LFS store on Bitbucket. As of Git 2.11, smudge and clean filters can be defined as long-running processes that are invoked once for the first filtered file, then fed subsequent files that need smudging or cleaning until the parent Git operation exits. git cat-file --filters: Another small improvement for users of Git LFS and other filter-based extensions is the new --filters option for the git cat-file command. The .png file in question happens to be tracked with Git LFS, and git cat-file skips the Git LFS smudge filter, so rather than getting our image back we just get the contents of the Git LFS pointer file.

Get ready for Advent of Code!! :)

A whole mess of documentation aggregated in an easy to read, searchable site, with offline mode.

No excuses, write unit tests

Unit testing can sometimes be a tricky subject no matter what language you’re writing in. Common obstacles: there’s fear that unit testing will take time your team doesn’t have; your team can’t agree on an acceptable level of test coverage or gets stuck bike-shedding; people are frustrated by breaking tests when changing code. Unit testing your code takes some extra time upfront, because of course, you need to write extra code - the tests. If tests did not break and that code went out to production, everywhere else the code was used would now be broken; you’ve got 99 problems but, lucky you, testing ain’t one. This rule can be applied to unit testing in a number of ways, but the most useful I’ve found is to first write the code to make the thing work, preferably as small functions, and then write a test for it. After you’ve been writing tests for a while, you should start to notice that more of the things you change will break existing tests. In my experience, I see no reason good enough not to at least have some unit testing.
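The "small function first, then a test for it" workflow described above might look like this in Python's built-in unittest module. The article does not prescribe a language or framework, so the choice of Python, the `slugify` function and the test names are all hypothetical.

```python
import unittest


def slugify(title):
    """Hypothetical small function: lower-case a title, join words with hyphens."""
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    """Written after the function works, one behaviour per test."""

    def test_joins_words_with_hyphens(self):
        self.assertEqual(slugify("No Excuses Write Tests"), "no-excuses-write-tests")

    def test_collapses_extra_whitespace(self):
        self.assertEqual(slugify("  unit   testing "), "unit-testing")
```

Saved in a file, the tests can be run with `python -m unittest`. If a later change to `slugify` alters either behaviour, the matching test breaks before the change reaches production.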

Zero-cost abstractions

Iterating over a range with a for loop implicitly creates an iterator; however, in the case of Range that iterator is the structure itself. Calling iter() constructs a slice iterator, and the call to zip implicitly constructs an iterator for the buffer slice as well. Zip and map both wrap their input iterators in a new iterator structure. Sum repeatedly calls next() on its input iterator, pattern matches on the result, and adds up the values until the iterator is exhausted. Iterators also introduce extra control flow: zip will terminate after one of its inner iterators is exhausted, so in principle it has to branch twice on every iteration. The input slices have a fixed size of 12 elements, and despite the use of iterators, the compiler was able to unroll everything here. In this post I’ve shown a small snippet of code that uses high-level constructs such as closures and iterator combinators, yet the code compiles down to the same instructions that a hand-written C program would compile to.
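The shape of the zip / map / sum pipeline described above can be sketched in Python. This only illustrates how the combinators nest and how the sum drives the chain with next(). It is not the post's Rust code, the parameter name `coefficients` is a guess, and Python makes no zero-cost promise of the kind the post demonstrates for Rust.

```python
def dot_with_iterators(coefficients, buffer):
    """Pairwise multiply and add, written with iterator combinators."""
    # zip stops as soon as either input is exhausted.
    pairs = zip(coefficients, buffer)
    return sum(map(lambda pair: pair[0] * pair[1], pairs))


def dot_by_hand(coefficients, buffer):
    """Roughly what sum() does: call next() until the iterator is exhausted."""
    pairs = zip(iter(coefficients), iter(buffer))
    total = 0
    while True:
        try:
            c, x = next(pairs)
        except StopIteration:
            return total
        total += c * x


print(dot_with_iterators([1, 2, 3], [4, 5, 6]))
print(dot_by_hand([1, 2, 3], [4, 5, 6]))
```

Both functions compute the same value. In Python the wrappers remain as objects at run time, whereas the post's point is that the Rust compiler removes these layers entirely.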
