Dan Rosenstark, author of MIDI Designer, on Tech & Music


Dan Rosenstark, Author & CEO of MIDI Designer, muses about all things tech. Particularly: Notes on software development in Swift, Objective-C, and many non-Apple languages. Also: lots of random technology notes on OS X and iOS.

The Whisper Script, Packaged

Overview

This is a script called whisper.

  • Takes .m4a and .mp3 files from a Dropbox directory
  • Has ChatGPT Whisper endpoint transcribe them
  • Has ChatGPT take the Whisper transcription, correct it and make paragraphs
  • Creates a text file in another Dropbox folder
  • Sends an email to Evernote

You need:

  • An OpenAI API key
  • Dropbox folders on your Mac
  • If you use the email functionality, you need SMTP credentials and a 'mail to' address for Evernote (or you can comment this part out).

Instructions

You need to change directories and generally think for yourself a bit. If you're not comfortable editing scripts, this might not be for you.

Some notes:

  • These scripts work for me, but my home directory is /Users/yourUser so you need to modify that.
  • Get an OpenAI API key
  • Create the Credentials file
  • Modify the scripts
  • Run whisper on its own first. You might need to handle Python dependencies with pip. ChatGPT can help you.
  • When you run Whisper inside the Launchctl context, you might get lost. Note that it dumps log files in /tmp
  • This blog screws up formatting a bit. You can see the raw Evernote note here. It's Markdown.

Credentials

I store these in $HOME/.secure_env

They should look like this

export OPENAI_API_KEY='' # beware this MIGHT have a dash in it!
export SMTP_SERVER=''
export SMTP_USERNAME=''
export SMTP_PASSWORD=''
export SMTP_PORT=''
export SMTP_SENDER=''
export EVERNOTE_EMAIL=''
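
Before running anything, it can help to confirm that these variables actually made it into the environment. The following is a minimal sketch and not part of the original scripts: the helper name missing_credentials is hypothetical, and the variable names are simply the ones from the file above.

```python
import os

# Variable names taken from the credentials file above.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "SMTP_SERVER",
    "SMTP_USERNAME",
    "SMTP_PASSWORD",
    "SMTP_PORT",
    "SMTP_SENDER",
    "EVERNOTE_EMAIL",
]

def missing_credentials(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the real process environment.
    """
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]

print("Missing:", missing_credentials() or "nothing")
```

Run it in a shell where you have already sourced $HOME/.secure_env; anything it lists is something the whisper script will not be able to see either.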

The Whisper Script

I stored this in ~/scripts/whisper. Ensure it's executable with chmod +x whisper.

#!/usr/bin/env python3

import datetime
from pathlib import Path
import argparse
from openai import OpenAI
import os, sys
import smtplib

client = OpenAI()

# NOTE: you need this in the environment
# export OPENAI_API_KEY='your key'
# BEWARE of the launch agent running from here
# ~/Library/LaunchAgents/com.confusionstudios.watchvoicedictationfolder.plist
# and that guy runs the whisper-wrapper to handle paths and environment variables

def get_smtp_credentials():
    # Retrieve SMTP server, port, username, and password from environment variables
    smtp_server = os.getenv("SMTP_SERVER")
    smtp_port = os.getenv("SMTP_PORT")
    smtp_username = os.getenv("SMTP_USERNAME")
    smtp_password = os.getenv("SMTP_PASSWORD")
    smtp_sender = os.getenv("SMTP_SENDER")
    return smtp_server, smtp_port, smtp_username, smtp_password, smtp_sender

def send_email_to_evernote(subject, body):
    recipient = os.getenv("EVERNOTE_EMAIL")

    # Get SMTP credentials
    smtp_server, smtp_port, smtp_username, smtp_password, smtp_sender = get_smtp_credentials()

    try:
        # Connect to the SMTP server
        smtp_server = smtplib.SMTP_SSL(smtp_server, smtp_port)
        smtp_server.login(smtp_username, smtp_password)

        # Compose the email message
        message = f"Subject: {subject} @Diary\n\n{body}"

        # Send the email
        smtp_server.sendmail(smtp_sender, recipient, message)

        # Close the connection
        smtp_server.quit()

        print("Email sent successfully.")
    except Exception as e:
        print(f"Error sending email: {e}")

def get_creation_time(file_path):
    return datetime.datetime.fromtimestamp(file_path.stat().st_ctime)

def transcribe_with_whisper_api(file_path):
    print(f"Sending up audio file {file_path.stem} to Whisper for transcription.")
    # Open the audio in binary mode; the with-block closes it once the upload is done
    with open(file_path, "rb") as audio_file:
        transcription = client.audio.transcriptions.create(
            model="whisper-1", file=audio_file, response_format="text"
        )

    print(f"Got the transcription, length: {len(transcription)}. Now sending to ChatGPT for post-processing.")

    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": "The following is a transcript. Please make paragraphs out of the transcript. Do not alter it but except to fix homonyms, spelling discrepancies and possible voice transcription mistakes. Feel free to adjust punctuation. Also note the spelling of MIDI Designer. Do not respond to the transcript! THE FOLLOWING IS THE RAW TRANSCRIPT:",
            },
            {
                "role": "user",
                "content": transcription,
            },
        ],
    )

    return completion.choices[0].message.content

def process_file(file_path, output_file, dry_run=False):
    if dry_run:
        print(f"Would transcribe {file_path.name} into {output_file}")
    else:
        print(f"Processing {file_path.name}...")
        transcript = transcribe_with_whisper_api(file_path)
        with open(output_file, "w") as f:
            f.write(transcript)
        print(f"Generated {output_file.name}")
        send_email_to_evernote(output_file.name, transcript)
        print(f"Sent Evernote Email with Subject {output_file.name}")

def process_files(source_dir, output_dir, dry_run=False):
    # Load list of already processed files
    transcribed_file_path = source_dir / 'transcribed-files.txt'
    if transcribed_file_path.exists():
        with transcribed_file_path.open('r') as file:
            processed_files = {line.strip() for line in file}
    else:
        processed_files = set()

    voice_files = list(source_dir.glob("*.m4a")) + list(source_dir.glob("*.mp3"))
    for file_path in voice_files:
        if file_path.name in processed_files:
            print(f"Skipping {file_path.name}, already processed.")
            continue

        creation_time = get_creation_time(file_path)
        formatted_time = creation_time.strftime("%Y-%m-%d-%H-%M")
        old_filename_without_extension = file_path.stem

        if old_filename_without_extension.isdigit():
            output_file_name = f"{formatted_time}.txt"
        else:
            output_file_name = f"{formatted_time} - {old_filename_without_extension}.txt"

        output_file = output_dir / output_file_name

        process_file(file_path, output_file, dry_run)

        if not dry_run:
            # Add file to processed list and update file
            processed_files.add(file_path.name)
            with transcribed_file_path.open('a') as file:
                file.write(file_path.name + '\n')

def parse_arguments():
    parser = argparse.ArgumentParser(
        description="Process voice dictations and generate text files."
    )
    parser.add_argument(
        "--dry-run", action="store_true", help="Print out actions without executing"
    )
    return parser.parse_args()

def main():
    args = parse_arguments()
    source_dir = Path("~/Dropbox/x-fer/voice-dictation").expanduser()
    if not os.path.exists(source_dir):
        print(f"The source directory {source_dir} does not exist.")
        sys.exit(1)

    output_dir = Path("~/Dropbox/x-fer/voice-dictation-output").expanduser()

    output_dir.mkdir(parents=True, exist_ok=True)

    process_files(source_dir, output_dir, dry_run=args.dry_run)

if __name__ == "__main__":
    main()
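
One thing to watch in get_creation_time: on macOS and other Unix systems, st_ctime is the time of the last metadata change, not the time the file was created. Python exposes the real creation time as st_birthtime on platforms that provide it, macOS included. Below is a hedged sketch of a drop-in variant. The fallback to the modification time when st_birthtime is missing is an assumption for illustration, not something the script above does.

```python
import datetime
from pathlib import Path

def get_creation_time(file_path):
    """Return the file's creation time where the OS reports it.

    st_birthtime exists on macOS; elsewhere this sketch falls back to the
    modification time (an assumption, pick whatever suits your setup).
    """
    stat = Path(file_path).stat()
    timestamp = getattr(stat, "st_birthtime", stat.st_mtime)
    return datetime.datetime.fromtimestamp(timestamp)

# Same format string the script uses for output file names
print(get_creation_time(".").strftime("%Y-%m-%d-%H-%M"))
```

Whether this matters depends on how the audio files arrive: a sync client may rewrite metadata, in which case st_ctime reflects the sync rather than the recording.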

The Launchctl Command That Loads Whisper Wrapper

This is ~/Library/LaunchAgents/com.confusionstudios.watchvoicedictationfolder.plist

NOTE: You need to adjust paths for everything in this plist except for the /tmp outputs which do give great debugging information.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.confusionstudios.watchvoicedictationfolder</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Users/yourUser/scripts/whisper-wrapper</string>
    </array>
    <key>WatchPaths</key>
    <array>
        <string>/Users/yourUser/Dropbox/x-fer/voice-dictation</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <false/>
    <key>StandardOutPath</key>
    <string>/tmp/whisper-wrapper.out</string>
    <key>StandardErrorPath</key>
    <string>/tmp/whisper-wrapper.err</string>
</dict>
</plist>
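
Since every path in this plist has to be adjusted by hand, a quick check can catch a typo before you go digging through /tmp. The sketch below is an illustration and not part of the original setup: it parses a plist with Python's standard plistlib module and reports the program path and WatchPaths entries that do not exist on disk. The function name missing_paths is hypothetical.

```python
import os
import plistlib

def missing_paths(plist_bytes):
    """Return the program path and WatchPaths entries that do not exist."""
    config = plistlib.loads(plist_bytes)
    # Only the first ProgramArguments entry is the executable itself
    candidates = config.get("ProgramArguments", [])[:1] + config.get("WatchPaths", [])
    return [path for path in candidates if not os.path.exists(path)]

# Usage (the file name is the one from this post; adjust to yours):
# plist = os.path.expanduser(
#     "~/Library/LaunchAgents/com.confusionstudios.watchvoicedictationfolder.plist")
# with open(plist, "rb") as f:
#     print(missing_paths(f.read()))
```

An empty list means both the wrapper and the watched folder were found; it says nothing about whether the wrapper is executable.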

The Wrapper Script

Launchctl is missing stuff from the environment so you need this wrapper script. Ensure it's executable with chmod +x whisper-wrapper

#!/bin/bash
source $HOME/.secure_env
/opt/homebrew/bin/python3 $HOME/scripts/whisper #note the location of python AND the script, you might have to adjust

Also: you need to name the Python executable path so it can pick up its dependencies.

Version 0.01h

Weakify and Strongify Macros for Objective-C

I worked with Chat to hammer these out. Not sure why everybody on the Internet suggests typeof but anyway...

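The short version: typeof() and __typeof() are spellings of the same GCC extension, which both GCC and LLVM/Clang support. The plain typeof keyword is only available when GNU extensions are enabled, while the underscore spellings (__typeof and __typeof__) work in every language mode, which makes them the safer choice for macros. Either way, the operator lets you declare a variable of the same type as another variable.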

And so...

// weakify: creates a new weak reference 'weakSelf' to the variable (helps avoid retain cycles)
// Usage: weakify(self)
#define weakify(var) __weak __typeof(var) weakSelf = var;

// strongify: creates a new strong reference 'strongSelf' from the weak reference (ensures object stays in memory during block execution)
// Usage: strongify(weakSelf)
#define strongify(var) __strong __typeof(var) strongSelf = var;

Managing Git Submodule Branches When Switching Branches

Is This Written by ChatGPT?

Well, it's not not written by ChatGPT. But it's a collab.

Problem

Working with git submodules presents a unique challenge: when switching branches in the superproject (the main project), the submodule's branch doesn't automatically change. Instead, it stays on the previously checked-out branch. This lack of tracking can lead to inconsistencies between the superproject and the submodule, resulting in confusion and potential issues.

Note About Submodules

Before we delve into the solution, let's clarify the typical use of git submodules. Submodules are designed to allow a Git repository to be a subdirectory of another Git repository, maintaining their commits separately. In essence, submodules pin a specific commit, not a branch, from an external repository into your primary repository.

Therefore, if your submodule's branch needs to track the branch of your main project, it might indicate that these two components should not be separate repositories. However, in certain cases where you find tracking branches useful, the following solution could serve your needs.

Solution

To ensure that the submodule changes to a corresponding branch when switching branches in the superproject, we can create two scripts: one to set (record) the current branch of the submodule when switching branches, and another to restore the submodule to its recorded branch when switching back.

Implementation

In the implementation, we create two Python scripts: git-submodule-branch-set and git-submodule-branch-restore. Here's how they work:

  1. When switching branches in the superproject, run git-submodule-branch-set. This script traverses all submodules, recording their current branch by creating a file in the superproject's .git directory. The filename is a combination of the branch name of the superproject and the path to the submodule, and the file's content is the name of the branch the submodule is currently on.

  2. After switching back to a previous branch in the superproject, run git-submodule-branch-restore. This script goes through all submodules, looking for a corresponding file in the .git directory. If it finds one, it reads the submodule's branch from the file and checks out the submodule to that branch.
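
The naming scheme described above can be sketched as a small helper. This is an illustration rather than the code from the scripts below: the function name branch_record_path is hypothetical, and unlike the scripts it also replaces slashes in the superproject branch name, on the assumption that a branch such as feature/login should not turn into a subdirectory under .git.

```python
import os

def branch_record_path(superproject_branch, submodule_path, git_dir=".git"):
    """Build the path of the file that records a submodule's branch."""
    # Slashes would otherwise be treated as directory separators
    safe_branch = superproject_branch.replace("/", "_")
    safe_submodule = submodule_path.replace("/", "_")
    return os.path.join(git_dir, f"submodule_branch_{safe_branch}_{safe_submodule}")

print(branch_record_path("feature/login", "libs/audio"))
```

Replacing slashes with underscores is simple, but it means feature/login and feature_login would share a record file. If your branch names can collide like that, pick a different escape character.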

git-submodule-branch-set

This script records the current branch of the submodule when you switch branches in the superproject.

#!/usr/bin/env python3
# git-submodule-branch-set

import os
import subprocess

# Get the list of all submodules in the current repository
submodules_output = subprocess.check_output(['git', 'config', '--file', '.gitmodules', '--get-regexp', 'path']).strip().decode('utf8')
submodules = [line.split()[1] for line in submodules_output.split('\n')]

# Get the current branch of the superproject
branch = subprocess.check_output(['git', 'rev-parse', '--abbrev-ref', 'HEAD']).strip().decode('utf8')

for submodule in submodules:
    # Construct the filename where the submodule branch will be stored
    submodule_branch_file = os.path.join('.git', f'submodule_branch_{branch}_{submodule.replace("/", "_")}')

    # Record the current branch of the submodule
    submodule_branch = subprocess.check_output(['git', '-C', submodule, 'rev-parse', '--abbrev-ref', 'HEAD']).strip().decode('utf8')
    with open(submodule_branch_file, 'w') as file:
        file.write(submodule_branch)
    print(f"Submodule '{submodule}' set to branch '{submodule_branch}'")

git-submodule-branch-restore

This script restores the submodule to the recorded branch when you switch back to a previous branch.

#!/usr/bin/env python3
# git-submodule-branch-restore

import os
import subprocess

# Get the list of all submodules in the current repository
submodules_output = subprocess.check_output(['git', 'config', '--file', '.gitmodules', '--get-regexp', 'path']).strip().decode('utf8')
submodules = [line.split()[1] for line in submodules_output.split('\n')]

# Get the current branch of the superproject
branch = subprocess.check_output(['git', 'rev-parse', '--abbrev-ref', 'HEAD']).strip().decode('utf8')

for submodule in submodules:
    # Construct the filename where the submodule branch will be stored
    submodule_branch_file = os.path.join('.git', f'submodule_branch_{branch}_{submodule.replace("/", "_")}')

    # If a file exists that records the branch of the submodule, checkout to that branch in the submodule
    if os.path.isfile(submodule_branch_file):
        with open(submodule_branch_file, 'r') as file:
            submodule_branch = file.read().strip()
        subprocess.check_call(['git', '-C', submodule, 'checkout', submodule_branch])
        print(f"Submodule '{submodule}' restored to branch '{submodule_branch}'")
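
Both scripts discover submodules the same way, by splitting each line of the git config output on whitespace. A slightly more forgiving version of that parsing step might look like the sketch below. The helper name parse_submodule_paths is hypothetical, and it assumes submodule names (the middle part of the key) contain no spaces.

```python
def parse_submodule_paths(config_output):
    """Extract submodule paths from `git config --get-regexp path` output.

    Each line is expected to look like: submodule.<name>.path <path>
    """
    paths = []
    for line in config_output.splitlines():
        parts = line.split(None, 1)  # split once, so the path may contain spaces
        if len(parts) == 2:
            paths.append(parts[1])
    return paths

sample = "submodule.libs/audio.path libs/audio\nsubmodule.ui.path vendor/ui kit\n"
print(parse_submodule_paths(sample))
```

Compared with line.split()[1], this keeps paths that contain spaces intact and returns an empty list for empty output instead of raising an IndexError.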

Why I like Evernote more than Notion


Notion has a whole bunch of concepts. There are lists! Which are not quite the same as pages, but they also are pages. But you can’t put text on them. And, it’s hard to put pages under them, even though lists only contain other pages. Also, you can’t just say "I am going to list all the pages under this page in this page as a list." So it’s very powerful, and very flexible. And you can have links, and back links, and everything can be in multiple places, and… Didn’t I already go through this with Workflowy? It’s so amazing, it’s so flexible, it’s just like your mind, which is amazingly flexible and allows for massive complexity.

Enter Evernote, with its antiquated simple concepts. You have notebooks that have pages in them. You also can have notebook stacks, which can only contain notebooks. You cannot have notebook stacks of notebook stacks of notebooks. You cannot put notebooks in other notebooks, nor can you put pages directly in notebook stacks. There is no way that I know of to get a list of the pages inside a notebook in another page, unless you want to link to the pages yourself.

So the paradigm is extremely simple.

So for MIDI Designer, for instance, I have a Notebook stack called MIDI Designer. Then, inside that I have a notebook called MD tickets. And another notebook called MD tickets closed. And then I have another Notebook called MIDI Designer knowledge, or MIDI Designer K base. And then I have another Notebook called MIDI Designer marketing. If I was serious about marketing, that might become several notebooks.

And so, Evernote is the ultimate trash bin, but a very organized one: it's clear how to record information and where that information will be later. For instance, for my day job, I have a notebook under the Day-Job stack called meetings, and another notebook called journal, which contains notes by week, mostly.

Somehow, the simple paradigm lines up nicely and keeps me from feeling overwhelmed and confused. Notion, on the other hand, is so flexible I can never figure out where to go next, and definitely not quickly.

I should note that I do some things outside of Evernote. Or a lot of things. For instance, I use Microsoft To Do for my to do list. And that’s on top of my bullet journal that I keep in Notability on iPad with pencil. And then I have my journal in Day One. I should note that it's important to pick winners in other categories to avoid thrash. I had picked Wunderlist which became Microsoft To Do but... sniff.

Update
Things I do with Evernote that Notion doesn't do (I think):
  • I pay Evernote to be able to email notes to my Evernote notebooks.
  • Evernote can publish individual notes, which is how you're reading this.
  • I use Evernote to publish my blogs, including this one via Postach.io and some that are critical to my business. It's not perfect, clearly.

Things Evernote sucks at:
  • Their document scanner on iOS is the worst, but you can use Genius Scan.
  • Their Apple Pencil app also sucks but you can use any other.
  • They don't do great with code blocks, but at least they don't mangle things like OneNote
  • I haven't used Evernote much for collaboration, so I don't know how it does. It ain't Google Docs, that's for sure.

Swift Watch and Run

This is my Swift "run on change" watcher script.

#!/bin/sh
# swiftWatchAndRun
if [ $# -ne 1 ]; then
    echo "Use like this:"
    echo "   $0 filename-to-watch"
    exit 1
fi
if which fswatch >/dev/null; then
    echo "Watching swift file $1"
    # Quote "$1" so file names with spaces work; behavior is otherwise unchanged
    while true; do
        fswatch --one-event "$1" >/dev/null && echo "----------------"
        date +"%m-%d-%y %I:%M%p"
        echo "----------------" && swift "$1"
        sleep 0.1
    done
else
    echo "You might need to run: brew install fswatch"
fi
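
If fswatch is not an option, the same idea can be approximated by polling the file's modification time. This is a rough sketch under that assumption, not a port of the script above: the function names are hypothetical, and polling is less efficient than fswatch's event-based watching.

```python
import os
import subprocess
import time

def wait_for_change(path, last_mtime, poll_seconds=0.5):
    """Block until the file's modification time differs from last_mtime."""
    while True:
        mtime = os.stat(path).st_mtime
        if mtime != last_mtime:
            return mtime
        time.sleep(poll_seconds)

def watch_and_run(path):
    """Re-run `swift <path>` every time the file changes (Ctrl-C to stop)."""
    last_mtime = os.stat(path).st_mtime
    while True:
        last_mtime = wait_for_change(path, last_mtime)
        print("----------------")
        print(time.strftime("%m-%d-%y %I:%M%p"))
        print("----------------")
        subprocess.run(["swift", path])

# Usage: watch_and_run("scratch.swift")
```

The half-second poll interval is arbitrary; a shorter one reacts faster at the cost of more stat calls.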

WeakArray for the Last Time (Swift 4.2)

I am obsessed with weak arrays. If you're trying to use multiple "protocolized" delegates, you need a weakly-held array of delegates.

This one started with this great article and expanded from there. Big thanks to Varindra Hart for the wise counsel as always.

@objc public protocol Equalable: class {
    @objc func isEqual(_ object: Any?) -> Bool
}

/// Store AnyObject subclasses weakly
/// * Note: if you wish to use a protocol, it must:
///   - be marked with `@objc`
///   - have all methods marked with `@objc`
///   - refine Equalable
public struct WeakArray<Element: Equalable> {
    private var items: [WeakBox<Element>] = []

    public init(_ elements: [Element]? = nil) {
        guard let elements = elements else { return }
        items = elements.map { WeakBox($0) }
    }

    public mutating func append(_ newElement: Element) {
        let box = WeakBox(newElement)
        items.append(box)
    }

    public mutating func remove(_ element: Element) {
        items.removeAll { item in
            return item.unbox?.isEqual(element) ?? false
        }
    }

    public var unboxed: [Element] {
        let filtered = items.filter { $0.unbox != nil }
        return filtered.compactMap { $0.unbox }
    }

    public var boxedCount: Int {
        return items.count
    }
}

extension WeakArray: Collection {
    public var startIndex: Int { return items.startIndex }
    public var endIndex: Int { return items.endIndex }

    public subscript(_ index: Int) -> Element? {
        return items[index].unbox
    }

    public func index(after idx: Int) -> Int {
        return items.index(after: idx)
    }
}

private final class WeakBox<T: Equalable> {
    weak var unbox: T?
    init(_ value: T) {
        unbox = value
    }
}

GitHub Gist

The need to mark your protocol with @objc is a bug in Swift. See the StackOverflow question about this.

Modifiable: Modify and Assign without Temporary Variables

This Protocol & Extension allow you to assign and modify all in one line.

I've been meaning to write this for a long time. It's working in Swift 4.1. It's definitely a tiny bit tricky, definitely not necessary, and perhaps too cute to be worth it. But it is cute!

import UIKit

/// Modifiable, Dan Rosenstark 2018, inspired from:
/// https://medium.com/@victor.pavlychko/using-self-in-swift-class-extensions-6421dab02587
protocol Modifiable {}

extension NSObject: Modifiable {}

extension Modifiable {
    func modify(block: (Self)->()) -> Self {
        block(self)
        return self
    }
}

/// examples
let view = UIView().modify { $0.backgroundColor = .green }
print(view.backgroundColor == UIColor.green)

Xcodebuild: Valid Destinations on OSX?

To get a list of valid destinations, specify an erroneous or incomplete key-value pair (the command below gives only the platform) and xcodebuild will spit out the combinations that work.

List Destinations Command

xcodebuild test -destination 'platform=iOS Simulator' -workspace Register.xcworkspace -scheme ThatTestTarget

Output Example

Available destinations for the "ThatTestTarget" scheme:

{ platform:iOS Simulator, id:145A9B7E-B336-4819-8059-2FFEC408E05E, OS:11.1, name:iPad (5th generation) }
{ platform:iOS Simulator, id:69ABAF6F-ADA3-4E38-AC97-D71001447663, OS:9.3, name:iPad 2 }
{ platform:iOS Simulator, id:550E2F18-406D-4586-84BB-E48F1D704F27, OS:10.3.1, name:iPad Air }
{ platform:iOS Simulator, id:94734F1C-775F-40FA-9015-8196C08805EF, OS:11.1, name:iPad Air }
{ platform:iOS Simulator, id:1DB953DD-CD97-4EC7-8006-BCF01DF3E63F, OS:11.1, name:iPad Air 2 }
{ platform:iOS Simulator, id:DE3072DA-2E31-423D-9D77-220626F8B90A, OS:11.1, name:iPad Pro (9.7-inch) }
{ platform:iOS Simulator, id:3B5D18DB-13B5-4F28-B654-7D2ECDD1F6F0, OS:11.1, name:iPad Pro (10.5-inch) }
{ platform:iOS Simulator, id:A4225E3A-512C-4F42-ADD9-1E7E448C4D27, OS:11.1, name:iPad Pro (12.9-inch) }
{ platform:iOS Simulator, id:684FF1BA-8784-4B7C-B4E5-5231772F0FAC, OS:11.1, name:iPad Pro (12.9-inch) (2nd generation) }

Change Colons for Equals Signs, Remove Spaces, Ignore the ID

So if you want to use this destination:

platform:iOS Simulator, id:684FF1BA-8784-4B7C-B4E5-5231772F0FAC, OS:11.1, name:iPad Pro (12.9-inch) (2nd generation)

Change the colons for equals signs, remove the spaces after the commas, remove the ID, so you get this string:

platform=iOS Simulator,OS=11.1,name=iPad Pro (12.9-inch) (2nd generation)

Then the entire command would be:

xcodebuild test -destination 'platform=iOS Simulator,OS=11.1,name=iPad Pro (12.9-inch) (2nd generation)' -workspace Register.xcworkspace -scheme ThatTestTarget
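
The conversion is mechanical enough to script. Here is a minimal sketch, assuming each line of the listing uses ", " between fields and that device names never contain that separator; the function name destination_argument is hypothetical.

```python
def destination_argument(listing_line):
    """Turn one line of xcodebuild's destination listing into a -destination value."""
    # Drop the surrounding braces, then split into key:value fields
    fields = listing_line.strip().strip("{}").strip().split(", ")
    # Split on the first colon only, in case a value contains one
    pairs = [field.split(":", 1) for field in fields]
    # Skip the id and join the rest as key=value with no spaces after commas
    return ",".join(f"{key}={value}" for key, value in pairs if key != "id")

line = "{ platform:iOS Simulator, id:684FF1BA-8784-4B7C-B4E5-5231772F0FAC, OS:11.1, name:iPad Pro (12.9-inch) (2nd generation) }"
print(destination_argument(line))
# prints: platform=iOS Simulator,OS=11.1,name=iPad Pro (12.9-inch) (2nd generation)
```

Wrap the result in single quotes when you pass it to -destination, since it contains spaces and parentheses.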

From my StackOverflow Answer here

Square in Autolayout in Swift 3/4

Sometimes I marvel about how StackOverflow has been taken over by a zealous group of idiots who close questions at will. Here's today's example, and below is my answer, since I can't put it on the SO question itself.

extension UIView {
    func constrainToSquareRelativeToView(_ view: UIView, multiplier: CGFloat = 1.0) {
        translatesAutoresizingMaskIntoConstraints = false
        let widthConstraint = NSLayoutConstraint(item: self, attribute: .width, relatedBy: .lessThanOrEqual, toItem: view, attribute: .width, multiplier: multiplier, constant: 0)
        widthConstraint.priority = .defaultLow
        let heightConstraint = NSLayoutConstraint(item: self, attribute: .height, relatedBy: .equal, toItem: view, attribute: .height, multiplier: multiplier, constant: 0)
        heightConstraint.priority = .defaultLow
        let squareConstraint = NSLayoutConstraint(item: self, attribute: .height, relatedBy: .equal, toItem: self, attribute: .width, multiplier: 1.0, constant: 0)
        let centerX = NSLayoutConstraint(item: self, attribute: .centerXWithinMargins, relatedBy: .equal, toItem: view, attribute: .centerXWithinMargins, multiplier: 1, constant: 0)
        let centerY = NSLayoutConstraint(item: self, attribute: .centerYWithinMargins, relatedBy: .equal, toItem: view, attribute: .centerYWithinMargins, multiplier: 1, constant: 0)
        view.addConstraints([widthConstraint, heightConstraint, squareConstraint, centerX, centerY])
    }
}

Here's my result (inside which I'm playing with a UICollectionView, but that's another story):





How I Survive on Any Unix (Bash)

If you must type it all:

export EDITOR=nano
PS1='\h:\w\$ '

If you can copy and paste:

export EDITOR=nano

function parse_git_dirty {
    [[ $(git status 2> /dev/null | tail -n1) != "nothing to commit, working tree clean" ]] && echo "*"
}
function parse_git_branch {
    git branch --no-color 2> /dev/null | sed -e '/^[^*]/d' -e "s/* \(.*\)/(\1$(parse_git_dirty))/"
}
export PS1='\h:\[\033[1;33m\]\w\[\033[0m\]$(parse_git_branch)$ '