(ns dna)

(def nucleotides #{\A, \T, \C, \G, \U})

(defn nucleotide-counts [strand]
  (let
    [count
     (fn [counted-nucleotides nucleotide]
       (assoc
         counted-nucleotides
         nucleotide
         (+ (get counted-nucleotides nucleotide) 1)))]
    (reduce count {\A 0, \T 0, \C 0, \G 0} strand)))

(defn count [nucleotide strand]
  (if (contains? nucleotides nucleotide)
    (get (nucleotide-counts strand) nucleotide 0)
    (throw (Exception. "invalid nucleotide"))))
The code is very similar to the one for the Word count exercise. The main difference is that here I used a set to check that the given nucleotide is valid.
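As a rough illustration of the two ideas in this version (counting with reduce and validating with a set), here is a minimal sketch. The names valid-nucleotides, add-nucleotide and counts are made up for this example and are not part of the solution; the strand "GATTACA" is just sample input.

```clojure
;; Hypothetical names, used only for this illustration.
(def valid-nucleotides #{\A \T \C \G \U})

;; One reduction step: bump the counter stored under this nucleotide.
(defn add-nucleotide [counts nucleotide]
  (assoc counts nucleotide (inc (get counts nucleotide))))

;; Start from zero counts and fold the strand into the map.
(def counts (reduce add-nucleotide {\A 0, \T 0, \C 0, \G 0} "GATTACA"))

(println (= counts {\A 3, \T 2, \C 1, \G 1})) ; prints true

;; \U is accepted by the set but never counted, so the default 0 is returned.
(println (contains? valid-nucleotides \U))    ; prints true
(println (get counts \U 0))                   ; prints 0

;; \X is not in the set, so the solution would throw for it.
(println (contains? valid-nucleotides \X))    ; prints false
```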
I eliminated the duplication between the set and the map contents in this second version:
(ns dna)

(def dna-nucleotides #{\A, \T, \C, \G})

(def nucleotides (conj dna-nucleotides \U))

(defn nucleotide-counts [strand]
  (let
    [counted-nucleotides
     (zipmap dna-nucleotides (repeat (count dna-nucleotides) 0))
     count
     (fn [counted-nucleotides nucleotide]
       (assoc
         counted-nucleotides
         nucleotide
         (+ (get counted-nucleotides nucleotide) 1)))]
    (reduce count counted-nucleotides strand)))

(defn count [nucleotide strand]
  (if (contains? nucleotides nucleotide)
    (get (nucleotide-counts strand) nucleotide 0)
    (throw (Exception. "invalid nucleotide"))))
In this version I defined a set containing only the DNA nucleotides, dna-nucleotides, and used conj to build the nucleotides set from it. The dna-nucleotides set also serves to generate the counted-nucleotides map through the zipmap function.
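A small sketch of how conj and zipmap behave here may help. The name empty-counts is invented for this example. It also shows a possible shortcut: since zipmap stops at the shorter of its two collections, an unbounded (repeat 0) should work as well as (repeat (count dna-nucleotides) 0).

```clojure
(def dna-nucleotides #{\A \T \C \G})

;; conj on a set returns a new set; dna-nucleotides itself is unchanged.
(def nucleotides (conj dna-nucleotides \U))

;; zipmap pairs keys with values position by position and stops when the
;; shorter collection runs out, so the infinite (repeat 0) is safe here.
;; empty-counts is a hypothetical name for this illustration.
(def empty-counts (zipmap dna-nucleotides (repeat 0)))

(println (= empty-counts {\A 0, \T 0, \C 0, \G 0})) ; prints true
(println (contains? nucleotides \U))                ; prints true
(println (contains? dna-nucleotides \U))            ; prints false
```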
While trying to avoid duplication, I discovered several new things about Clojure.
You can nitpick my solution here or see all the exercises I've done so far in this repository.
--------------------------------
Update:
After learning some new stuff, I've been able to simplify the code a bit more:
(ns dna)

(def ^:private dna-nucleotides #{\A, \T, \C, \G})

(def ^:private nucleotides (conj dna-nucleotides \U))

(defn nucleotide-counts [strand]
  (merge {\A 0, \T 0, \C 0, \G 0} (frequencies strand)))

(defn count [nucleotide strand]
  (if (contains? nucleotides nucleotide)
    (get (nucleotide-counts strand) nucleotide 0)
    (throw (Exception. "invalid nucleotide"))))
It turned out that the frequencies function already does the counting out of the box. I just needed to merge its result with the counts for an empty strand so that the output conforms to what the tests expect.
I also made the dna-nucleotides and nucleotides sets private.
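To see why the merge is needed, here is a short sketch of what frequencies returns on its own and after merging. The name zero-counts and the strand "GGTA" are assumptions made for this example only.

```clojure
;; frequencies only reports the characters that actually occur,
;; so a strand without \C yields a map with no \C entry.
(println (contains? (frequencies "GGTA") \C)) ; prints false

;; zero-counts is a hypothetical name for the result expected for an empty strand.
(def zero-counts {\A 0, \T 0, \C 0, \G 0})

;; merge lets the right-hand map win on shared keys, so the zero defaults
;; survive only for nucleotides that are missing from the strand.
(println (= (merge zero-counts (frequencies "GGTA"))
            {\A 1, \T 1, \C 0, \G 2}))        ; prints true

;; For an empty strand frequencies returns {}, and the defaults come back untouched.
(println (= (merge zero-counts (frequencies "")) zero-counts)) ; prints true
```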
You can nitpick this new version here.